Abstract 

Quantum information processing is the emerging field that defines and realizes computing 
devices that make use of quantum mechanical principles, like the superposition principle, en- 
tanglement, and interference. Until recently the common notion of computing was based on 
classical mechanics, and did not take into account all the possibilities that physically-realizable 
computing devices offer in principle. The field gained momentum after Peter Shor developed an 
efficient algorithm for factoring numbers, demonstrating the potential computing powers that 
quantum computing devices can unleash. 

In this review we study the information counterpart of computing. It was realized early on 
by Holevo that quantum bits, the quantum mechanical counterpart of classical bits, cannot be 
used for efficient transformation of information, in the sense that arbitrary k-bit messages 
cannot be compressed into messages of k - 1 qubits. 

The abstract form of the distributed computing setting is called communication complexity. 
It studies the amount of information, in terms of bits or in our case qubits, that two spatially 
separated computing devices need to exchange in order to perform some computational task. 
Surprisingly, quantum mechanics can be used to obtain dramatic advantages for such tasks. 

We review the area of quantum communication complexity, and show how it connects the 
foundational physics questions regarding non-locality with those of communication complexity 
studied in theoretical computer science. The first examples exhibiting the advantage of the use 
of qubits in distributed information-processing tasks were based on non-locality tests. However, 
by now the field has produced strong and interesting quantum protocols and algorithms of its 
own that demonstrate that entanglement, although it cannot be used to replace communication, 
can be used to reduce the communication exponentially. In turn, these new advances yield a 
new outlook on the foundations of physics, and could even yield new proposals for experiments 
that test the foundations of physics. 

1 Introduction 

1.1 Background 

During the last decades of the twentieth century it was realized that information processing at 
the quantum level could offer tremendous advantages over conventional "classical" information 
processing. Quantum information admits extremely efficient algorithms, such as Shor's factoring 
algorithm [123], and qualitatively superior cryptographic protocols, such as the BB84 key distribution 
protocol [17]. Many other works contributed to put this field on solid foundations. Quantum 
error-correcting codes and fault-tolerant quantum computation showed that these beautiful ideas 
could in principle be realized experimentally. These codes, combined with Holevo's Theorem, Schu- 
macher compression, and entanglement distillation (which are analogs of Shannon's noiseless coding 
theorem) gave us the foundations of an information theory pertaining to quantum systems in terms 
of quantum bits, or qubits, and entanglement that is measured (in the bipartite case) in entangle- 
ment bits, or ebits. These discoveries generated huge excitement. By now quantum information 
has become a well-established field, and there are many reviews and textbooks to which we refer 
the reader for background information. See for example [103]. 

In view of the advantages that quantum information offers for computation and cryptography, 
it is natural to enquire whether quantum information is also a superior medium for efficient com- 
munication. In this article we will review progress on this specific question, and its relation to the 
problem of quantum non-locality which has fascinated physicists for decades. 

On the face of it, there are important reasons for doubting that quantum information provides 
such a communication efficiency advantage. Many years before the "quantum information" dis- 
cipline took hold on a large scale, Holevo [75] proved an important theorem about the classical 
information capacity of quantum channels. Holevo's Theorem — as it is now called — states that, for 
any classical message, the cost of transmitting it from one party (Alice) to another party (Bob) 
in terms of quantum bits (qubits) is the same as the cost of transmitting it in terms of classical 
bits. If the task requires k bits on average, then it also requires k qubits on average. The latter 
consequence of Holevo's Theorem can be proven quite simply using a different approach [99], and 
this proof is reproduced in Appendix [A]. Thus one would naively expect that quantum information 
cannot provide a communication efficiency advantage. This intuition turns out to be wrong. 
Tremendous communication savings are possible with the use of quantum information, as explained 
in the next section. 

1.2 Communication complexity 

To understand why quantum information can provide a communication advantage without contra- 
dicting Holevo's Theorem, it is necessary to consider more precisely the various scenarios that can 
be associated with "communication". 

The simplest scenario, corresponding to the case covered by Holevo's Theorem, is illustrated 
in Fig. [1]. There are two parties that we refer to as Alice and Bob. Alice has an n-bit string x 
that she would like to convey to Bob by sending one message. Here it is indeed true, by Holevo's 
Theorem [75], that quantum messages are no more efficient than classical messages. Alice must 
send n qubits to accomplish this specific task. 

Input: $x \in \{0,1\}^n$ (given to Alice) 
Output: x (produced by Bob) 

Figure 1: The basic communication scenario: Alice receives an n-bit string x as input and sends 
one message to Bob, who must output x. For this task, a quantum message is no more efficient 
than a classical message. 

A variant of the communication scenario is where Bob's goal is not to determine Alice's data x, 
but to determine some information that is a function of x in a way that may depend on other 
data y that resides with Bob (while y is unknown to Alice). Such a scenario could occur when Alice 
and Bob each begin with n-bit strings, x and y, respectively (Alice knows x but not y and Bob 
knows y but not x), and the goal is for Bob to determine the value of some function f(x, y) (where 
f is known to both parties). An example where such a scenario could arise is where Alice and Bob 
are interested in scheduling an appointment. Alice's schedule could be represented by x and Bob's 
by y: if there are n time-slots, then we can set the ith bit of x to 1 if Alice is available in time-slot i, 
and similarly for y. How much communication is required for Bob to find a time when they are 
both available (i.e., an i such that $x_i = y_i = 1$)? We shall see that, for this communication scenario, 
quantum information enables Alice and Bob to accomplish the task with less (asymptotically less 
in the number of time-slots) qubit communication than would be required by any protocol that is 
restricted to classical bit communication. 
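
To make the scheduling example concrete, the following Python sketch implements the trivial classical protocol in which Alice sends her entire n-bit schedule and Bob searches for a common free slot. The function name `trivial_protocol` and the list-of-bits encoding are illustrative assumptions, not notation from this review. The sketch only fixes the baseline cost of n bits against which more efficient protocols are measured.

```python
from typing import List, Optional, Tuple


def trivial_protocol(x: List[int], y: List[int]) -> Tuple[Optional[int], int]:
    """Alice sends all of x; Bob returns a common free slot (or None) and the bit cost."""
    if len(x) != len(y):
        raise ValueError("schedules must have the same number of time-slots")
    message = list(x)         # Alice's single message: her whole schedule
    bits_sent = len(message)  # communication cost of this protocol: n bits
    for i, (xi, yi) in enumerate(zip(message, y)):
        if xi == 1 and yi == 1:  # both parties are available in slot i
            return i, bits_sent
    return None, bits_sent


# Alice is free in slots 1 and 3, Bob in slots 2 and 3 (0-indexed).
slot, cost = trivial_protocol([0, 1, 0, 1], [0, 0, 1, 1])
print(slot, cost)
```

In this sketch the cost is always n bits regardless of the inputs, which is the quantity that cleverer classical or quantum protocols try to reduce.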

This kind of scenario, illustrated in Fig. [2] (for general functions or relations f on 
$\{0,1\}^n \times \{0,1\}^n$), is known as communication complexity. It has been extensively studied in the classical 
case. Indeed, whereas the trivial solution to this problem is for Alice to send Bob her input x, and 
for Bob to compute f(x,y), it is often possible for Bob to compute f with far fewer than n bits of 
classical communication. These savings in classical communication are very interesting both from 
a practical and a conceptual point of view. Section [3] outlines several of the key results in the area, 
and we refer the reader to the textbooks [83], [77] for further information. 

Inputs: $x \in \{0,1\}^n$ (given to Alice), $y \in \{0,1\}^n$ (given to Bob) 
Output: f(x,y) (produced by Bob) 

Figure 2: The basic communication complexity scenario: Alice and Bob receive n-bit strings, x 
and y respectively, as input and their goal is to compute some function of these values f(x,y), as 
Bob's output. There are tasks of this form where communication in terms of quantum messages 
is much more efficient than communication in terms of classical messages. The number of qubits 
can be exponentially smaller than the number of bits. Note that in this framework we do not take 
into account the time and other resources that Alice and Bob spend locally (although in practice 
it turns out that their local computations are almost always efficient). 

When Alice and Bob can communicate qubits, further reductions in the amount of communi- 
cation are possible, sometimes even exponential reductions. This remarkable situation is clearly 
worthy of further study. It is one of the main subjects covered by the present review, and we will 
see many examples later. 

1.3 Quantum non-locality 

Long before the work on quantum communication complexity mentioned in the previous section, 
physicists investigating the foundations of quantum mechanics studied the scenario where local 
measurements are carried out on two entangled particles. Such entangled states can (at least in 
principle) be easily produced by having the particles interact together for some time, and then 
sending the particles away to far-off locations. Local measurements are then carried out on the 
particles. This scenario was first studied by Einstein, Podolsky, and Rosen [60] and immediately 
afterwards by Schrödinger [118], [119] (who coined the word entanglement). In these works it was 
realized that the results of the local measurements would exhibit very interesting correlations. For 
instance, for some pairs of the measurements, the results may be always the same; for other pairs 
of measurements, the results may be always opposite, etc. 

Nevertheless, one can easily show — this follows immediately from the structure of quantum 
mechanics — that the parties carrying out the measurements cannot use the entangled particles to 
communicate to each other. More precisely, if two physically separated parties, Alice and Bob, 
initially possess entangled particles and then Alice is given an arbitrary bit x, there is no way 
for Alice to manipulate her particles in order to convey any information about x to Bob when he 
performs measurements on his particles. 

Given that these correlations cannot be used for communication, one would naively expect that 
if a (quantum or classical) model can reproduce these correlations, then it is not necessary for 
that model to use communication. This is indeed the case in the quantum scenario where, having 
established the entanglement through some interaction in the past, no communication is needed 
at the time of the measurement. But if one wants to reproduce these correlations in a purely 
classical model, then classical communication between the parties is required at the moment of the 
measurements! This situation is even more surprising if the particles are widely separated from 
each other and the measurements take place during a very short time interval, so short that the 
two measurement events are space-like separated. In this case the communication would have to 
occur faster than the speed of light! 

This remarkable feature of quantum mechanics was discovered by Bell [13], and is now known 
as "quantum non-locality". It has been the subject of much further theoretical and experimental 
study since. Indeed it is one of the most surprising and counter-intuitive features of quantum me- 
chanics. Bell's Theorem shows that Einstein's program of trying to rationalize quantum mechanics 
by reducing it to classical mechanics is futile and doomed to failure, as it cannot be done without 
giving up another cornerstone of twentieth century physics (discovered by Einstein himself), namely 
the fact that information cannot travel faster than the speed of light. More recently, another reason 
why such a reduction is doomed emerged through the study of quantum information. Namely we 
expect any such classical description of quantum mechanics to be exponentially inefficient, i.e., to 
use exponentially more resources than the quantum theory. We will discuss quantum non-locality 
extensively in the present review, focusing on its connection to communication complexity. 

1.4 Unity of quantum communication complexity and quantum non-locality 

The reason why in this review we deal with quantum communication complexity and quantum 
non-locality together is that these two topics are intimately related. Indeed they can be formulated 
in a unified way, and furthermore many questions can be mapped from one topic to the other. 
In fact, during the past dozen years an intense cross-fertilization has occurred between these two 
fields, which has considerably enriched both of them. 

To see the unity between the two subjects, recall that in both cases the parties, Alice and 
Bob, are given some inputs, x and y. In one case these inputs correspond to the arguments of the 
function that must be computed. In the other case these inputs correspond to a description of the 
measurements that must be carried out on the particles (the "measurement settings"). And in both 
cases Alice and Bob must provide an output, a and b. In communication complexity we require 
that b = f(x,y) and a is irrelevant; in non-locality we are interested in the correlations between a, 
b and x, y (for instance we request that a = b when x and y have certain values and that $a \neq b$ 
when x and y have some other values). We can unify these descriptions by saying that the aim in 
both cases is to produce a joint probability distribution 

P(a,b|x,y) 

of the outputs given the inputs, such that P(a,b|x,y) has certain desirable properties. 

1.5 Resources 

In both communication and non-locality, the basic question one wants to answer is: what is the 
minimum amount of resources necessary to reproduce the distribution P(a,b|x,y), and how does 
this amount change when one changes the model, i.e., when one changes the type of resource that 
can be used? There are in fact many different types of resources that can be compared, and we now 
briefly review them. We will come back to them in more detail in the body of the review. 

• Quantum communication. The parties are allowed to send each other quantum states. One 
quantifies the amount of communication by the number of qubits sent. 

• Classical communication. The parties are allowed to send each other classical communication. 
One quantifies the amount of communication by the number of bits sent. 

• Entanglement. The parties share entangled states. One quantifies the amount of entanglement 
by the number of qubits that the state locally consists of. For example we frequently use 
maximally entangled states of 2 qubits, called ebits (also known as EPR pairs after [60]), 
$\frac{1}{\sqrt{2}}(|0\rangle|0\rangle + |1\rangle|1\rangle)$ or something that can be obtained from this with local operations. 

• Shared randomness. The parties have randomness, i.e., they are allowed to toss coins. In 
the case of shared randomness, the parties both share the same string of coins. This could 
for instance be implemented by having the parties toss the coins beforehand, at some earlier 
time when they are together, and then use the coins later when they need to solve the 
communication complexity problem. 

• Local randomness. The parties have randomness, i.e., they are allowed to toss coins. In the 
case of local randomness the coins are tossed locally, and the string of outcomes of the coins 
for Alice is independent of the string of outcomes of the coins for Bob. 

The rationale for measuring classical information in terms of bits is Shannon's noiseless coding 
theorem [121], which states that, asymptotically, the information produced by a stochastic source 
can be encoded in a number of bits equal to the entropy of the source. This is paralleled in the 
quantum case by Schumacher compression [120], which states that, asymptotically, the information 
produced by a stochastic quantum source can be encoded into a number of qubits equal to the von 
Neumann entropy of the source. And it is paralleled in the case of entanglement by entanglement 
distillation, namely the fact that pure two-party entangled states can, asymptotically in the number 
of copies of the state, be converted into a number of ebits equal to the von Neumann entropy 
of the reduced density matrix of each party [16]. In the context of communication complexity, 
however, we are not dealing with the asymptotic limit of large amounts of communication or large 
amounts of entanglement. Thus whereas in most cases we will keep the basic concepts of bits, 
qubits and ebits, it could be relevant in specific cases to consider variants on these resources, such 
as trits, non-maximally entangled states, etc. 

The above resources have been ordered (more or less) from the strongest to the weakest. Indeed 
most of these resources imply the ones below them. For instance one can send classical information 
using qubits; one can use quantum communication to distribute entanglement; one can measure 
the entangled particles to produce shared randomness, etc. The only case where the ordering is 
not so clear is between classical communication and entanglement. Indeed if two parties share an 
entangled state, they cannot use it to communicate (as discussed above). But on the other hand 
(as discussed below) sharing n ebits may allow one to save an exponentially large (in n) number 
of bits in some communication scenarios (whereas in all other cases, n uses of one resource allow 
one to implement n uses of the resources below it). 

There are also a number of nontrivial ways in which these resources can be substituted one 
for the other. Quantum teleportation allows one to substitute one ebit and two bits of classical 
communication for one qubit of quantum communication [14]. Dense coding shows that sharing 
one ebit and then communicating one qubit allows one to communicate two bits [15]. Newman's 
Theorem states that in the context of communication complexity, having shared randomness can 
save only a small amount of communication compared to having local randomness [101]. 
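
The dense coding trade-off (one shared ebit plus one transmitted qubit conveys two classical bits) can be checked on a four-amplitude state vector. The sketch below is a minimal pure-Python simulation. It assumes the standard textbook encoding (Alice applies X and/or Z to her half of the ebit) and decoding (Bob applies a CNOT and a Hadamard, then measures), which this review does not spell out; the helper names are hypothetical.

```python
import math
from typing import List, Tuple

# Amplitudes are indexed by 2*a + b, where a is Alice's qubit and b is Bob's qubit.
State = List[complex]
R = 1 / math.sqrt(2)


def apply_x_first(s: State) -> State:
    """Pauli X on the first qubit: swaps the a=0 and a=1 amplitudes."""
    return [s[i ^ 2] for i in range(4)]


def apply_z_first(s: State) -> State:
    """Pauli Z on the first qubit: flips the sign of the a=1 amplitudes."""
    return [(-amp if i & 2 else amp) for i, amp in enumerate(s)]


def apply_cnot(s: State) -> State:
    """CNOT with the first qubit as control: |a,b> -> |a, b xor a>."""
    out = [0j] * 4
    for i, amp in enumerate(s):
        a, b = i >> 1, i & 1
        out[2 * a + (b ^ a)] += amp
    return out


def apply_h_first(s: State) -> State:
    """Hadamard on the first qubit."""
    out = [0j] * 4
    for b in (0, 1):
        out[b] = R * (s[b] + s[2 + b])
        out[2 + b] = R * (s[b] - s[2 + b])
    return out


def dense_coding(b1: int, b2: int) -> Tuple[int, int]:
    """Send two bits using one shared ebit and one qubit of communication."""
    state: State = [complex(R), 0j, 0j, complex(R)]  # the ebit (|00> + |11>)/sqrt(2)
    if b2:
        state = apply_x_first(state)  # Alice's local encoding
    if b1:
        state = apply_z_first(state)
    # Alice now sends her qubit to Bob, who holds both qubits and decodes.
    state = apply_h_first(apply_cnot(state))
    probs = [abs(amp) ** 2 for amp in state]
    outcome = max(range(4), key=lambda i: probs[i])
    assert abs(probs[outcome] - 1) < 1e-9  # the measurement outcome is deterministic
    return outcome >> 1, outcome & 1


for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(bits, dense_coding(*bits))
```

Under these assumptions Bob's measurement returns exactly the pair of bits Alice encoded, in agreement with the resource accounting stated above.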

In addition we will at some points in this review consider other additional (more specialized or 
more exotic) resources. For instance one can consider 

• One-way classical or quantum communication. Alice is allowed to communicate to Bob, but 
Bob is not allowed to communicate back to Alice. 

• Simultaneous Message Passing model. In this model there is a third party, called the Referee, 
and messages are only allowed from Alice to the Referee and from Bob to the Referee. It is 
the Referee who has to compute the value of the function f(x,y). 

• Multipartite entanglement. Sometimes one is interested in non-locality or communication 
complexity between more than two parties. Contrary to bipartite entanglement where it is 
sufficient to consider ebits, there are many kinds of multiparticle entanglement (such as GHZ 
states, W states, etc.) which could be useful for solving different communication problems. 

• Non-local (or PR) boxes. This exotic resource is intermediate between an ebit and a bit. 
Indeed, it is a resource which does not enable the parties to communicate (in the same way 
that entanglement does not allow communication). But to be produced physically it requires 
a bit of communication between the parties at the moment it is used (contrary to entangle- 
ment which once established requires no more communication). Its study provides a deeper 
understanding of the power and limitations of quantum entanglement in communication com- 
plexity. 

1.6 Basic scenarios 

The basic question asked in communication complexity and quantum non-locality is to understand 
how much of these resources are required in different situations. 
Thus classical communication complexity [83] is basically concerned with understanding how 
much classical communication is required to compute the value of a function f(x, y), possibly using 
(shared or local) randomness. 

In quantum communication complexity the parties are trying to compute the value of f, but may 
now use quantum resources. In the quantum communication model, introduced by Yao [137], they 
can communicate qubits, and in the entanglement model, introduced by Cleve and Buhrman [45], 
the parties share entangled particles and are allowed to communicate classical bits. When one 
extends the quantum communication model of Yao such that the parties also share entangled 
particles, quantum teleportation shows that these two models are essentially equivalent: one qubit 
in the first model can be replaced by two bits and one ebit in the entanglement model, and conversely one 
bit can be simulated by one qubit. It is, however, a challenging open problem whether the quantum 
communication model, without shared entanglement, is essentially equivalent to the entanglement 
model. 

Non-locality, although at first sight a very different topic, is also concerned with comparing 
resources. Indeed the basic question in this area is to compare: 

• The correlations that can be obtained if the parties share entanglement and carry out local 
measurements on their particles, but are not allowed any communication. 

• The correlations that can be obtained if the parties have shared randomness, but are not 
allowed any communication. This is known in the physics literature as a local hidden variable 
model. 

Bell's Theorem states that these two scenarios are not equivalent: shared randomness alone is 
not sufficient to reproduce the quantum correlations. 

1.7 Mappings between communication complexity and non-locality 

Thus quantum communication complexity, classical communication complexity, and non-locality 
can be put in a unified framework in which similar kinds of resources are compared. In addition, 
in some cases there exist mappings between quantum communication complexity scenarios and 
non-locality scenarios. 

The simplest such mapping occurs in the entanglement model if the parties can solve the 
communication complexity problem more efficiently using entanglement than without entangle- 
ment, and if this can be done by measuring their entangled particles before they communicate to 
each other. Then it immediately follows that the correlations obtained by measuring their entangled 
particles (but without communicating), cannot be realized in a local hidden variable model. 

Conversely it is possible to map any non-locality experiment to a communication complexity 
problem in the entanglement model. This was the approach used in the original paper [35]. It 
mapped the non-local correlations that arise in the GHZ paradox to a communication complexity 
problem. This approach has since been generalized [26], although in the resulting communication 
complexity problem the function f(x,y) is only computed successfully by the parties with non-zero 
probability. 

Another mapping can occur in the quantum communication model when one-way quantum 
communication from Alice to Bob is more efficient than classical communication. Then it is often 
possible to construct from the communication complexity problem a nontrivial non-locality scenario. 
This approach has yielded some very interesting non-locality scenarios which we will describe in 
detail below. 

1.8 Summary of the review 

In this review we will present some of the main results obtained so far in the field of quantum 
communication complexity. We start by introducing quantum non-locality in Section [2], focusing on 
its relation with communication complexity. We present simple examples such as the GHZ paradox, 
the CHSH example, and the magic square game, rephrasing them in the language of data processing. 
Next we present quantum communication complexity in Section [3], illustrating it with examples such 
as the distributed Deutsch-Jozsa problem, the intersection problem, Raz's problem, and the hidden 
matching problem. In Section [4] we unite these two approaches, showing how some of the examples 
from quantum communication complexity can be used to derive new non-locality games. In Section [5] 
we discuss another model of communication complexity, the simultaneous message passing model, 
and show how classical communication, entanglement, and quantum communication can be traded one 
for the other in this model. In Section [6] we discuss several additional aspects of quantum non- 
locality, such as non-local boxes, Tsirelson bounds, and simulation of quantum correlations using 
classical resources. Finally we consider in Section [7] experimental issues, in particular the detection 
loophole, and present the outlook for future experiments. We conclude by discussing some open 
questions in the field. The interested reader can also consult the earlier review [20] which covers 
some of the material presented here. 

2 Simple Non-locality Examples 

The idea of non-locality was originally concerned with the possibility that quantum mechanics is 
actually a classical theory that depends on "hidden variables" whose values might be discovered in 
the future as part of some successor theory to quantum mechanics. Bell [13] proposed a hypothetical 
experiment for ruling out such classical theories under the assumption that measurements of quan- 
tum systems can occur at different points in space-time, and information cannot be transmitted 
faster than the speed of light. 

Another way of interpreting Bell's experiment is as a method for two (or more) cooperating 
distributed parties to compute some sort of input-output relation, where each party receives input 
data and must produce output data consistent with the relation. In Bell's experiment, there is such 
a task that cannot be accomplished in a setting where the information processing resources are all 
classical. In contrast, the task can be accomplished if the parties share prior entanglement. 

Since Bell's seminal work, the concept of quantum non-locality has been extensively studied, 
by physicists, philosophers, and more recently by computer scientists. Some of the important early 
advances have been the Clauser-Horn-Shimony-Holt (CHSH) inequality [44] which allows Bell's 
surprising predictions to be tested even in the presence of noise; and the GHZ-Mermin scenario [72], 
[94] which was the first "pseudo-telepathy" game. More recently there has been a more or less 
systematic enumeration of Bell inequalities for small numbers of settings and/or outcomes (see, e.g., 
[4^1 14"51 11341 1141] ); the study of the statistical power of non-locality tests [52]; an understanding of 
the limits to quantum non-locality (Tsirelson-type bounds) [43] as compared to the larger world of 
correlations obeying only the no-signalling conditions (e.g., non-local boxes); investigations of the 
power of non-locality in cryptographic settings [11], etc. 

In the next paragraphs we review various non-locality scenarios, casting them in the language of 
data processing. The reader wishing to complement this overview could consult two recent reviews, 
written more from physics [135] and computer science [21] perspectives. 

2.1 GHZ: Greenberger-Horne-Zeilinger and Mermin 



The following scenario essentially underlies those of [72], [94], but is cast in the language of data 
processing. The basic structure is illustrated in Fig. [3]. Three physically separated parties (call 
them Alice, Bob, and Carol) receive input bits s, t, and u, respectively, which are arbitrary subject 
to the condition that $s \oplus t \oplus u = 0$ ($\oplus$ denotes exclusive or, which is the sum of its arguments in 
modulo 2 arithmetic). Once they receive their input data, they are forbidden from having any 
communication between them. Their goal is to produce output bits a, b, and c, respectively, such 
that 

$$a \oplus b \oplus c = \begin{cases} 0 & \text{if } stu = 000, \\ 1 & \text{if } stu \in \{011, 101, 110\}. \end{cases} \qquad (1)$$

Inputs: s (given to Alice), t (given to Bob), u (given to Carol) 
Outputs: a (from Alice), b (from Bob), c (from Carol) 

Figure 3: The general form of a non-locality scenario involving three parties: Alice, Bob, and 
Carol receive inputs s, t, u respectively and are required to produce outputs a, b, c, respectively, 
satisfying certain conditions. Once the inputs are received, no communication is permitted between 
the parties. For the specific GHZ scenario, it is possible to accomplish the task if the parties are 
in possession of a tripartite entangled state. Without the prior entanglement, it is impossible to 
accomplish the task. 

Note that the task that the three parties are trying to accomplish is the computation of a relation, 
where there are three input bits (stu) and three output bits (abc). The task is nontrivial in light of 
the fact that the input bits are distributed among the parties so that each party is given the value 
of only one of them; the output bits are also distributed. 

The first observation is that with classical resources there must be communication among the 
three parties to succeed. To see why this is so, first consider deterministic strategies (later we 
will analyze the case of probabilistic strategies, where the parties behave stochastically, i.e., they 
can flip coins). Since Alice cannot receive any information from Bob or Carol, her output bit a 
can depend only on the value of her input bit s. Let $a_0$ (respectively $a_1$) be Alice's output when 
her input bit is 0 (respectively 1). Similarly let $b_0, b_1$ and $c_0, c_1$ be Bob and Carol's outputs for 
their respective input values. Note that the six bits $a_0, a_1, b_0, b_1, c_0, c_1$ completely characterize any 
deterministic strategy of Alice, Bob, and Carol. The conditions of the problem translate into the 
equations 

$$\begin{aligned}
a_0 \oplus b_0 \oplus c_0 &= 0, \\
a_0 \oplus b_1 \oplus c_1 &= 1, \\
a_1 \oplus b_0 \oplus c_1 &= 1, \\
a_1 \oplus b_1 \oplus c_0 &= 1.
\end{aligned} \qquad (2)$$

It is impossible to satisfy all four equations simultaneously. This is because each of the six bits 
appears exactly twice on the left-hand sides, so summing the four equations modulo two yields 
0 = 1 (recall that 1 + 1 = 0 modulo 2). Therefore, for any strategy, 
there exists an input configuration $stu \in \{000, 011, 101, 110\}$ for which it fails. Note however that 
for any three out of the four equations from (2) there is a strategy that satisfies these three equations 
perfectly. 
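
The counting argument can also be confirmed by exhaustive search, since a deterministic strategy is just six bits. The following Python sketch enumerates all 64 strategies and counts how many of the four input configurations each one answers correctly. The names `INPUTS`, `required_parity`, and `num_satisfied` are illustrative choices, not notation from this review.

```python
from itertools import product

# The four allowed input configurations (s, t, u).
INPUTS = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]


def required_parity(s: int, t: int, u: int) -> int:
    """Target value of a xor b xor c for a given input configuration."""
    return 0 if (s, t, u) == (0, 0, 0) else 1


def num_satisfied(a, b, c) -> int:
    """a, b, c are pairs (output on input 0, output on input 1) for each party."""
    return sum(
        (a[s] ^ b[t] ^ c[u]) == required_parity(s, t, u) for s, t, u in INPUTS
    )


# Enumerate all 2^6 deterministic strategies (a0, a1, b0, b1, c0, c1).
scores = [
    num_satisfied((a0, a1), (b0, b1), (c0, c1))
    for a0, a1, b0, b1, c0, c1 in product((0, 1), repeat=6)
]
best = max(scores)
print(best, scores.count(4))
```

In this enumeration no strategy is correct on all four inputs, and the best strategies are correct on three of them.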

To see why probabilistic strategies cannot succeed either, note that any such strategy can be 
modeled as a deterministic strategy where Alice, Bob, and Carol have access to a random variable 
r (for example, r could be the outcomes of a sequence of uniformly distributed random bits). This 
r is sometimes referred to as a "local hidden variable". It is assumed that the testing procedure 
does not have access to r, so that the input bits (stu) are uncorrelated with r. The intuitive 
way of thinking about this scenario is that the three parties get together before the game starts, 
randomly select r, and then each party secretly keeps a copy of this information. An example of a 
probabilistic strategy is for $r \in \{0,1\}^2$ to be two uniformly random bits that specify which three 
of the four equations in (2) are satisfied. This probabilistic strategy succeeds with probability 3/4. 
We next show that this success probability is optimal. 

Suppose that the input data s, t, u is uniformly distributed over {000, 011, 101, 110}. Then the 
success probability that any randomized protocol achieves is 

$$\sum_r q_r \, \frac{1}{4} \sum_{s,t,u} P(s,t,u,r), \qquad (3)$$

where $q_r$ is the probability (of the shared randomness) that the parties flip r, and P(s,t,u,r) = 1 if 
the deterministic protocol corresponding to r is correct on input stu and P(s,t,u,r) = 0 otherwise. 
Clearly this is bounded above by 

$$\max_r \frac{1}{4} \sum_{s,t,u} P(s,t,u,r), \qquad (4)$$

which by the above discussion is at most 3/4. 

Now consider the same problem, but where Alice, Bob, and Carol have an additional resource:
each is supplied with a qubit, where the state of the combined 3-qubit system is¹

\[
\tfrac{1}{2}|000\rangle - \tfrac{1}{2}|011\rangle - \tfrac{1}{2}|101\rangle - \tfrac{1}{2}|110\rangle. \tag{5}
\]

The parties are allowed to apply unitary transformations and perform measurements on their
individual qubits, but communication between the parties is still forbidden. It turns out that now
the parties can produce a, b, c satisfying Eq. (1). This is achieved by the procedure that follows.

The procedure for Alice is to measure her qubit in the computational basis (consisting of $|0\rangle$
and $|1\rangle$) if her input bit s is 0, and to measure her qubit in the Hadamard basis (consisting of
$H|0\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $H|1\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$) if her input bit is 1. In either case, she sets her
output bit a to the outcome of her measurement. The procedures for Bob and Carol are similar to
that of Alice, but with Bob's bits being t and b, and Carol's bits being u and c.

1 This is an entangled state that is equivalent to the so-called GHZ state $\frac{1}{\sqrt{2}}|000\rangle + \frac{1}{\sqrt{2}}|111\rangle$ (under local unitary
operations).

To see why the described procedure always produces output bits abc satisfying Eq. (1), consider
the various cases of the input possibilities stu. In the case where stu = 000, the state is measured
in the computational basis, so clearly the outcomes are from {000, 011, 101, 110}, and hence satisfy
$a \oplus b \oplus c = 0$. The case where stu = 011 can be analyzed by assuming that a Hadamard transform
is applied to the last two qubits of the state prior to a measurement in the computational basis.
Since

\[
\begin{aligned}
& (I \otimes H \otimes H)\left(\tfrac12|000\rangle - \tfrac12|011\rangle - \tfrac12|101\rangle - \tfrac12|110\rangle\right) \\
&\quad = (I \otimes H \otimes H)\left(\tfrac12|0\rangle(|00\rangle - |11\rangle) - \tfrac12|1\rangle(|01\rangle + |10\rangle)\right) \\
&\quad = \tfrac12|0\rangle(|01\rangle + |10\rangle) - \tfrac12|1\rangle(|00\rangle - |11\rangle) \\
&\quad = \tfrac12|001\rangle + \tfrac12|010\rangle - \tfrac12|100\rangle + \tfrac12|111\rangle,
\end{aligned}
\tag{6}
\]

$a \oplus b \oplus c = 1$, as required, in this case. The remaining cases where stu = 101 and 110 are similar
by the symmetry of the entangled state and protocol.
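
The case analysis above can also be checked numerically. The sketch below is a minimal simulation under our own conventions (it is not code from the original): a party with input 1 applies a Hadamard gate before a computational-basis measurement, and the required output parity is read off from equations (2), namely 0 on input 000 and 1 on the other three inputs.

```python
from itertools import product
from math import sqrt, isclose

H = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]
I2 = [[1.0, 0.0], [0.0, 1.0]]

# State (5): amplitudes indexed by 3-bit basis strings.
STATE = {(0, 0, 0): 0.5, (0, 1, 1): -0.5, (1, 0, 1): -0.5, (1, 1, 0): -0.5}

def outcome_distribution(s, t, u):
    """Probabilities of outputs (a, b, c); a party with input 1 applies H before measuring."""
    gates = [H if bit else I2 for bit in (s, t, u)]
    probs = {}
    for out in product((0, 1), repeat=3):
        amp = sum(coeff
                  * gates[0][out[0]][basis[0]]
                  * gates[1][out[1]][basis[1]]
                  * gates[2][out[2]][basis[2]]
                  for basis, coeff in STATE.items())
        probs[out] = amp * amp
    return probs

def success_probability(s, t, u):
    """Probability that a XOR b XOR c equals the parity required by equations (2)."""
    target = 0 if (s, t, u) == (0, 0, 0) else 1
    return sum(p for (a, b, c), p in outcome_distribution(s, t, u).items()
               if a ^ b ^ c == target)

for stu in [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]:
    print(stu, round(success_probability(*stu), 10))
```

For each of the four allowed inputs the computed success probability is 1 up to floating-point rounding.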

We have shown that the entangled state enables the three parties to correlate their output bits 
with their input bits in a manner that is impossible to achieve with classical resources, unless there
is communication among the parties. It should be noted that, in accomplishing this task using the 
entangled state, no actual communication occurs among the parties. In particular, the output bits 
a, b, and c individually contain no information about stu; they are uniformly distributed in all 
cases. It is only the trivariate correlations among a, b, and c that are related to the input data stu. 



2.2 CHSH: Clauser-Horne-Shimony-Holt 

Figure 4: The non-locality scenario involving two parties: Alice and Bob receive inputs s and t
respectively, and are required to produce outputs a and b respectively, satisfying certain conditions.
Once the inputs are received, no communication is permitted between the parties. For the specific
CHSH scenario, it is possible to accomplish the task with probability $\cos^2(\pi/8) = 0.853\ldots$ if the
parties are in possession of an ebit. Without the prior entanglement, the highest possible success
probability is 3/4.

The following scenario essentially underlies that of Clauser, Horne, Shimony, and Holt, but is cast in the
language of data processing. The basic structure is illustrated in Fig. 4. Alice and Bob receive input bits
s and t, respectively, and, after this, they are forbidden from communicating with each other. Their goal
is to produce output bits a and b, respectively, such that

\[
a \oplus b = s \wedge t, \tag{7}
\]

(where $\wedge$ is the logical and, which is 1 if all its arguments are 1, and which is 0 otherwise) or, failing that,
to satisfy this condition with as high a probability as possible. To analyze the situation in terms
of classical information, first again consider the case of deterministic strategies. For these, Alice's
output bit depends solely on her input bit s and similarly for Bob. Let $a_0, a_1$ be the two possibilities
for Alice and $b_0, b_1$ be the two possibilities for Bob. These four bits completely characterize any
deterministic strategy. Condition (7) translates into the equations

\[
\begin{aligned}
a_0 \oplus b_0 &= 0,\\
a_0 \oplus b_1 &= 0,\\
a_1 \oplus b_0 &= 0,\\
a_1 \oplus b_1 &= 1.
\end{aligned}
\tag{8}
\]

It is impossible to satisfy all four equations simultaneously (since summing them modulo 2 yields
$0 = 1$). Therefore it is impossible to satisfy Condition (7) with certainty. By using a probabilistic
strategy, Alice and Bob can satisfy Condition (7) with probability 3/4. For such a strategy, we
allow Alice and Bob to have a priori classical random variables, whose distribution is independent of
that of the inputs s and t. Note that any three of the four equations of (8) can be simultaneously
satisfied. The probabilistic classical strategy works as follows. Alice and Bob have uniformly
distributed random bits that are used to specify which of the four equations of (8) is violated, and
then play the strategy that satisfies the other three perfectly. It is easy to see that (a) for any input
st, the resulting outputs satisfy Condition (7) with probability 3/4, and (b) this is optimal in that
no probabilistic strategy can attain a success probability greater than 3/4.
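
Both parts of the classical analysis can be confirmed by enumerating the 16 deterministic strategies. The following Python sketch is illustrative only; the helper `wins` and the variable names are our own.

```python
from itertools import product
from fractions import Fraction

INPUTS = list(product((0, 1), repeat=2))

def wins(strategy):
    """Inputs (s, t) on which the deterministic strategy (a0, a1, b0, b1) satisfies a XOR b = s AND t."""
    a0, a1, b0, b1 = strategy
    a, b = (a0, a1), (b0, b1)
    return {(s, t) for s, t in INPUTS if a[s] ^ b[t] == (s & t)}

strategies = list(product((0, 1), repeat=4))

# Best deterministic success probability under uniformly random inputs.
best = max(Fraction(len(wins(st)), 4) for st in strategies)

# Shared-randomness mixture: choose uniformly which equation of (8) to violate,
# then play a deterministic strategy that satisfies the other three.
mixture = [next(st for st in strategies if wins(st) == set(INPUTS) - {violated})
           for violated in INPUTS]
per_input = {x: Fraction(sum(x in wins(st) for st in mixture), len(mixture))
             for x in INPUTS}

print(best)        # best deterministic value
print(per_input)   # success probability of the mixture on each input
```

The mixture wins with probability 3/4 on every input individually, not just on average, which is claim (a) above.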

Now consider the same problem but where Alice and Bob are each supplied with a qubit where
the state of the two-qubit system is initialized to

\[
\tfrac{1}{\sqrt{2}}\left(|00\rangle - |11\rangle\right). \tag{9}
\]

It turns out that now the parties can produce data that satisfies Condition (7) with probability
$\cos^2(\pi/8) = 0.853\ldots$, which is higher than what is possible in the classical case. This is achieved
by the following procedures. Denote the unitary operation that rotates the qubit by angle $\theta$ by

\[
R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\]

(where we have written it out in the computational basis). Alice applies
one of two rotations on her qubit, depending on her input bit s: if s = 0 the rotation is $R(-\pi/16)$;
if s = 1 the rotation is $R(3\pi/16)$. Then Alice measures her qubit in the computational basis and
sets her output bit a to the result. Bob's procedure is the same, depending on his input bit t. It is
straightforward to calculate that, if Alice rotates by $\theta_1$ and Bob rotates by $\theta_2$, then the entangled
state becomes

\[
\tfrac{1}{\sqrt{2}}\left(\cos(\theta_1 + \theta_2)(|00\rangle - |11\rangle) + \sin(\theta_1 + \theta_2)(|01\rangle + |10\rangle)\right). \tag{10}
\]

After the measurements, the probability that $a \oplus b = 0$ is $\cos^2(\theta_1 + \theta_2)$. It is now a straightforward
exercise to verify that Condition (7) is satisfied with probability $\cos^2(\pi/8)$ for all four input
possibilities.
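
The "straightforward exercise" can be carried out numerically. The sketch below assumes the rotation convention $R(\theta)|0\rangle = \cos\theta|0\rangle + \sin\theta|1\rangle$, $R(\theta)|1\rangle = -\sin\theta|0\rangle + \cos\theta|1\rangle$, which is the one consistent with Eq. (10); the function names are our own.

```python
from math import cos, sin, pi, sqrt, isclose

def rotate_pair(theta1, theta2):
    """Amplitudes of (R(theta1) tensor R(theta2)) applied to (|00> - |11>)/sqrt(2), keyed by (a, b)."""
    def R(theta):
        return [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]
    Ra, Rb = R(theta1), R(theta2)
    initial = {(0, 0): 1 / sqrt(2), (1, 1): -1 / sqrt(2)}
    return {(a, b): sum(c * Ra[a][i] * Rb[b][j] for (i, j), c in initial.items())
            for a in (0, 1) for b in (0, 1)}

# Rotation angle chosen by each party as a function of its input bit.
ANGLE = {0: -pi / 16, 1: 3 * pi / 16}

def success_probability(s, t):
    """Probability that the measured bits satisfy a XOR b = s AND t."""
    amps = rotate_pair(ANGLE[s], ANGLE[t])
    return sum(amp ** 2 for (a, b), amp in amps.items() if a ^ b == (s & t))

for s in (0, 1):
    for t in (0, 1):
        print(s, t, round(success_probability(s, t), 6))
```

All four inputs give the same value, $\cos^2(\pi/8) \approx 0.853553$: the angle sums are $-\pi/8$, $\pi/8$, $\pi/8$ for the three inputs requiring $a \oplus b = 0$, and $3\pi/8$ for the input 11, where $\sin^2(3\pi/8) = \cos^2(\pi/8)$.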
2.3 Tsirelson's upper bound for CHSH

Although the protocol in the previous subsection using entanglement has a higher success probability
($\cos^2(\pi/8) = 0.853\ldots$) than any classical protocol (3/4), it still does not succeed with
probability 1. This raises the question of whether there is a different strategy using entanglement
that always succeeds or, failing that, whose success probability exceeds $\cos^2(\pi/8)$. Tsirelson [33]
first showed that the above quantum protocol is optimal in that it is impossible to exceed success
probability $\cos^2(\pi/8)$, regardless of the strategy the parties use and of the amount of prior
entanglement they start with. What follows is a simple proof of this result.

Consider an arbitrary bipartite entangled state $|\psi\rangle_{AB}$. An arbitrary strategy for Alice that
uses this entangled state can be represented by two observables² $A_0$ and $A_1$, each with eigenvalues
in {+1, −1}. When Alice's input bit is 0, she obtains her output bit by applying the projective
measurement corresponding to the eigenspaces of $A_0$ to the component of $|\psi\rangle_{AB}$ in her possession.
The +1-eigenspace of $A_0$ corresponds to output bit 0, while the −1-eigenspace corresponds to the
output bit 1. When her input bit is 1, she applies the measurement corresponding to $A_1$. Similarly,
an arbitrary strategy for Bob can be represented by two observables $B_0$ and $B_1$.

At this point, the reader might object that $|\psi\rangle_{AB}$, $A_0$, $A_1$, $B_0$, and $B_1$ do not capture every
possible strategy of Alice and Bob, since they need not be limited to applying projective measurements.
Although non-projective measurements may be used, such measurements can always be
simulated by projective measurements in a larger Hilbert space. Thus, no generality has been lost
because any strategy can be converted to the above form.

Since the observables have eigenvalues in {+1, −1} rather than {0, 1}, it is more convenient here
to think of Alice and Bob's output bits in these terms as $a' = (-1)^a$ and $b' = (-1)^b$, respectively.
Then the protocol succeeds on input st if and only if $(-1)^{s \wedge t} \cdot a' \cdot b' = 1$.

If s and t are randomly chosen according to the uniform distribution, then the expected value of
$(-1)^{s \wedge t} \cdot a' \cdot b'$ is

\[
\langle\psi|_{AB} \left( \tfrac14 A_0 \otimes B_0 + \tfrac14 A_0 \otimes B_1 + \tfrac14 A_1 \otimes B_0 - \tfrac14 A_1 \otimes B_1 \right) |\psi\rangle_{AB}, \tag{11}
\]

and is therefore upper bounded by the largest eigenvalue of

\[
M = \tfrac14 A_0 \otimes B_0 + \tfrac14 A_0 \otimes B_1 + \tfrac14 A_1 \otimes B_0 - \tfrac14 A_1 \otimes B_1. \tag{12}
\]

It is straightforward to calculate that

\[
M^2 = \tfrac14 I - \tfrac1{16}(A_0A_1) \otimes (B_0B_1) + \tfrac1{16}(A_0A_1) \otimes (B_1B_0) + \tfrac1{16}(A_1A_0) \otimes (B_0B_1) - \tfrac1{16}(A_1A_0) \otimes (B_1B_0), \tag{13}
\]

from which we can upper bound the maximum eigenvalue of $M^2$ by the sum of the maximum
eigenvalue in each term, obtaining $\tfrac14 + \tfrac1{16} + \tfrac1{16} + \tfrac1{16} + \tfrac1{16} = \tfrac12$. It follows that the largest eigenvalue
of M itself is at most $1/\sqrt{2}$, which therefore upper bounds the expected value of $(-1)^{s \wedge t} \cdot a' \cdot b'$.
This translates into an upper bound of $(1 + 1/\sqrt{2})/2 = \cos^2(\pi/8)$ for the success probability of the
actual protocol (where Alice and Bob output bits a and b). This completes the proof of Tsirelson's
upper bound for CHSH.
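
For readers who want the intermediate steps behind Eq. (13) and the final conversion to a success probability, the calculation can be sketched as follows. It uses only $A_0^2 = A_1^2 = B_0^2 = B_1^2 = I$, which holds because each observable has eigenvalues in {+1, −1}; the grouping of terms and the symbol p (the success probability) are our own choice of presentation.

```latex
\begin{aligned}
M &= \tfrac14\bigl[A_0 \otimes (B_0 + B_1) + A_1 \otimes (B_0 - B_1)\bigr],\\
M^2 &= \tfrac1{16}\bigl[I \otimes (B_0+B_1)^2 + I \otimes (B_0-B_1)^2
      + A_0A_1 \otimes (B_0+B_1)(B_0-B_1) + A_1A_0 \otimes (B_0-B_1)(B_0+B_1)\bigr],\\
(B_0+B_1)^2 + (B_0-B_1)^2 &= 4I,\\
(B_0+B_1)(B_0-B_1) &= -B_0B_1 + B_1B_0, \qquad
(B_0-B_1)(B_0+B_1) = B_0B_1 - B_1B_0,\\
p - (1 - p) \le \tfrac{1}{\sqrt2}
\;&\Longrightarrow\;
p \le \tfrac12\Bigl(1 + \tfrac{1}{\sqrt2}\Bigr) = \tfrac12\bigl(1 + \cos\tfrac{\pi}{4}\bigr) = \cos^2\tfrac{\pi}{8}.
\end{aligned}
```

Substituting the third and fourth lines into the second reproduces Eq. (13) term by term. The last line uses the fact that the expected value of $(-1)^{s \wedge t} \cdot a' \cdot b'$ equals the probability of success minus the probability of failure.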

2 An observable is a Hermitian operator. One associates to an observable a projective measurement, with one 
projector for each of the eigenspaces of the observable. 
2.4 Magic square game

In one respect the GHZ example is more striking than the CHSH example: in the former case, the
protocol with entanglement always succeeds, while in the latter case the protocol with entanglement
merely succeeds with higher probability. However, the GHZ example involves three parties, whereas
the CHSH example only involves two. Is there a two-party scenario where the quantum protocol
always succeeds, whereas the best classical success probability is strictly below 1? The answer
is affirmative; see for instance [38], [3?], [39]. A particularly elegant example is the following game,
which has been referred to as the magic square game [5].

To define this game, consider the problem of labeling the entries of a 3 x 3 matrix with bits so
that the parity of each row is even, whereas the parity of each column is odd. It is not hard to see
that this is impossible.³ The two matrices

\[
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 0 \end{pmatrix}
\qquad
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}
\]

each satisfy five out of the six constraints. For the first matrix, all rows have even parity, but only
the first two columns have odd parity. For the second matrix, the first two rows have even parity,
and all columns have odd parity.

Bearing the above in mind, consider the game where Alice receives $s \in \{1, 2, 3\}$ as input (specifying
the number of a row), and Bob receives $t \in \{1, 2, 3\}$ as input (specifying the number of a
column). Their goal is to each produce 3-bit outputs, $a_1a_2a_3$ for Alice and $b_1b_2b_3$ for Bob, with
these properties:

1. They satisfy the row/column parity constraints. Namely, $a_1 \oplus a_2 \oplus a_3 = 0$ and $b_1 \oplus b_2 \oplus b_3 = 1$.

2. They are consistent where the row intersects the column. Namely, $a_t = b_s$.

As usual, Alice and Bob are forbidden from communicating once the game starts, so Alice does 
not know what t is and Bob does not know what s is. We shall observe that, classically, the best 
success probability possible is 8/9, whereas there is a quantum strategy that always succeeds. 

An example of a strategy that attains success probability 8/9 (when the input st is uniformly
distributed) is where Alice plays according to the rows of the first matrix above and Bob plays
according to the columns of the second matrix above. This succeeds in all cases, except where
s = t = 3. To see why this is optimal, note that any other deterministic classical strategy can also be
represented as two matrices as above but with different entries: Alice plays according to the rows of
the first matrix and Bob plays according to the columns of the second matrix. We can assume that the rows
of Alice's matrix all have even parity; if she outputs a row with odd parity then they immediately
lose, regardless of Bob's output. Similarly, we can assume that all columns of Bob's matrix have
odd parity.⁴ Considering such a pair of matrices, the players lose at each entry where they differ.
There must be such an entry, since otherwise it would be possible to have all rows even and all
columns odd with one matrix. Thus, when the input st is chosen uniformly from {1, 2, 3} x {1, 2, 3},
the success probability is at most 8/9.
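
The 8/9 bound for deterministic strategies can be confirmed by exhaustive search. The sketch below is only an illustration with our own names; it uses 0-based indices for rows and columns and restricts Alice to even-parity rows and Bob to odd-parity columns, as justified in the paragraph above.

```python
from itertools import product

EVEN_ROWS = [r for r in product((0, 1), repeat=3) if sum(r) % 2 == 0]
ODD_COLS = [c for c in product((0, 1), repeat=3) if sum(c) % 2 == 1]

def wins(alice_rows, bob_cols):
    """Number of inputs (s, t) on which Alice's entry t of row s equals Bob's entry s of column t."""
    return sum(alice_rows[s][t] == bob_cols[t][s]
               for s in range(3) for t in range(3))

# Alice's strategy: one even-parity row per input s; Bob's: one odd-parity column per input t.
best = max(wins(rows, cols)
           for rows in product(EVEN_ROWS, repeat=3)
           for cols in product(ODD_COLS, repeat=3))

print(best, "of 9 inputs")
```

The search covers 64 x 64 strategy pairs and finds that no pair wins on more than 8 of the 9 inputs.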



3 As before, we can express a valid solution in terms of equations, in this case six of them (where arithmetic is
modulo 2): $m_{11} + m_{12} + m_{13} = 0$, $m_{21} + m_{22} + m_{23} = 0$, $m_{31} + m_{32} + m_{33} = 0$, $m_{11} + m_{21} + m_{31} = 1$, $m_{12} + m_{22} + m_{32} = 1$,
$m_{13} + m_{23} + m_{33} = 1$. Adding these equations modulo 2 yields $0 = 1$.

4 In fact, the game can be simplified so that Alice and Bob each output just two bits, since the parity constraint 
determines the third bit. 
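The quantum strategy for this game is based on the following observation due to Mermin [93], [95].
Let I, X, Y, Z denote the 2 x 2 Pauli matrices:

\[
I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{14}
\]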

Each is an observable with eigenvalues in {+1, −1}. Consider the following table of two-qubit
observables that are each a tensor product of two Pauli matrices:

\[
\begin{array}{ccc}
X \otimes X & Y \otimes Z & Z \otimes Y \\
Y \otimes Y & Z \otimes X & X \otimes Z \\
Z \otimes Z & X \otimes Y & Y \otimes X
\end{array}
\]

For our present purposes, the noteworthy property is that the observables along each row commute
and their product is $I \otimes I$, and the observables along each column commute and their product
is $-I \otimes I$. This implies that, for any two-qubit state, performing the three measurements along
any row results in three {+1, −1}-valued bits whose product is +1. Also, performing the three
measurements along any column results in three {+1, −1}-valued bits whose product is −1. This
can be seen more easily when one simultaneously diagonalizes the three commuting observables.
They will have 1 and −1 eigenvalues on the diagonal. Each consecutive observable will project the
state onto a possible refinement of the current eigenspace the state lies in. This will yield that the
product of the outcomes of the three observables will be 1 in case the observables belong to a row
of the matrix, because the product of the row observables is $I \otimes I$, and −1 when they belong to a
column, since the product of the observables for each column is $-I \otimes I$.
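
The commutation and product properties can be verified mechanically. The following Python sketch uses plain nested lists for the 4 x 4 matrices; the helper names are our own, and the table is taken row by row as X⊗X, Y⊗Z, Z⊗Y; Y⊗Y, Z⊗X, X⊗Z; Z⊗Z, X⊗Y, Y⊗X.

```python
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def kron(a, b):
    """Tensor (Kronecker) product of two 2x2 matrices, as a 4x4 nested list."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scaled_identity(c):
    return [[c if i == j else 0 for j in range(4)] for i in range(4)]

TABLE = [[kron(X, X), kron(Y, Z), kron(Z, Y)],
         [kron(Y, Y), kron(Z, X), kron(X, Z)],
         [kron(Z, Z), kron(X, Y), kron(Y, X)]]

def check_line(ops, sign):
    """True if the three observables pairwise commute and multiply to sign times the identity."""
    commute = all(matmul(p, q) == matmul(q, p) for p in ops for q in ops)
    return commute and matmul(matmul(ops[0], ops[1]), ops[2]) == scaled_identity(sign)

rows_ok = all(check_line(TABLE[r], +1) for r in range(3))
cols_ok = all(check_line([TABLE[r][c] for r in range(3)], -1) for c in range(3))
print(rows_ok, cols_ok)
```

Exact equality tests are safe here because every matrix entry is one of 0, ±1, ±i, so the arithmetic involves no rounding.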

We can now describe the quantum protocol. It uses two pairs of entangled qubits, each of
which is in initial state $\frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)$. Alice, on input s, applies three two-qubit measurements
corresponding to the observables in row s of the above table. For each measurement, if the result
is +1, she outputs 0 and if the result is −1, she outputs 1. Similarly, Bob, on input t, applies the
measurements corresponding to the observables in column t, and converts the outcomes into bits
in the same manner.

We have already established that Alice and Bob's output bits satisfy the required parity constraints.
It remains to show that Alice and Bob's output bits that correspond to where the row
meets the column are the same. For that measurement, Alice and Bob are measuring with respect
to the same observable in the above table. Because all the observables in each row and in each
column commute, we may assume that the place where they intersect is the first observable applied.
Those bits are obtained by Alice and Bob each measuring $\frac12(|01\rangle - |10\rangle)(|01\rangle - |10\rangle)$ with respect to
the observable in entry (s, t) of the table. To show that their measurements will agree for all cases
of st, we consider the individual Pauli measurements on the individual entangled pairs of the form
$\frac{1}{\sqrt2}(|01\rangle - |10\rangle)$. Let a' and b' denote the outcomes of the first measurement (in terms of bits), and
a'' and b'' denote the outcomes of the second. Since the measurement associated with the tensor
product of two observables is operationally equivalent to measuring each individual observable and
taking the product of the results, we have that $a_t = a' \oplus a''$ and $b_s = b' \oplus b''$. It is straightforward
to verify that if the same measurement from {X, Y, Z} is applied to each qubit of $\frac{1}{\sqrt2}(|01\rangle - |10\rangle)$
then the outcomes will be distinct. Therefore, $a' \oplus b' = 1$ and $a'' \oplus b'' = 1$, from which it follows
that

\[
\begin{aligned}
a_t \oplus b_s &= (a' \oplus a'') \oplus (b' \oplus b'') \\
&= (a' \oplus b') \oplus (a'' \oplus b'') \\
&= 1 \oplus 1 \\
&= 0,
\end{aligned}
\tag{15}
\]

so $a_t = b_s$. This completes the analysis of the magic square game.

3 Communication Complexity 

In the last section we considered scenarios without communication. Here we will extend the non- 
locality setting to one where the parties (Alice and Bob) are allowed to send information to each 
other in the form of bits or qubits. They can still have shared randomness and may share an 
entangled quantum state. We are now interested in the minimum number of bits or qubits that 
are needed in order to compute a function that depends on the inputs of all the parties. 

The ability to send information to each other departs from the setting of non-locality. We will 
see that entanglement can be used to reduce (for certain functions) the communication drastically 
compared to when the parties share just classical resources. Accordingly, while entanglement cannot 
be used for signalling, it can be used to significantly reduce the communication needed for certain 
tasks. In later sections we will see how some of the ideas and protocols developed in the setting of 
communication complexity can be used to formulate new non-locality games. 

Communication complexity has been studied extensively in the area of theoretical computer 
science and has deep connections with seemingly unrelated areas, such as VLSI design, circuit 
lower bounds, lower bounds on branching programs, size of data structures, and bounds on the 
length of logical proof systems, to name just a few. We refer to the textbooks [53], [?] for more
details.

3.1 The setting 

First we sketch the setting for classical communication complexity. Alice and Bob want to compute 
some function $f : \mathcal{D} \to \{0, 1\}$, where $\mathcal{D} \subseteq X \times Y$. If the domain $\mathcal{D}$ equals $X \times Y$ then f is called
a total function, otherwise it is a promise function. Alice receives input $x \in X$, Bob receives input
$y \in Y$, with $(x, y) \in \mathcal{D}$. A typical situation, illustrated in Fig. [?], is where $X = Y = \{0, 1\}^n$, so
both Alice and Bob receive an n-bit input string. As the value f(x,y) will generally depend on 
both x and y, some communication between Alice and Bob is required in order for them to be able 
to compute f(x,y). We are interested in the minimal amount of communication they need. 

A communication protocol is a distributed algorithm where first Alice does some individual
computation, and then sends a message (of one or more bits) to Bob, then Bob does some computation
and sends a message to Alice, etc. Each message is called a round. After one or more rounds
the protocol terminates and outputs some value, which must be known to both players. The cost
of a protocol is the total number of bits communicated on the worst-case input. A deterministic
protocol for f always has to output the right value f(x, y) for all $(x, y) \in \mathcal{D}$. In a bounded-error
protocol, Alice and Bob may flip coins and the protocol has to output the right value f(x, y) with
probability at least 2/3 for all $(x, y) \in \mathcal{D}$. We could either allow Alice and Bob to toss coins individually
(local randomness, or "private coin") or jointly (shared randomness, or "public coin"). The latter is
analogous to the local hidden variables in non-locality games. A public coin can simulate a private
coin and is potentially more powerful. However, Newman's theorem [101] says that having a public
coin can save at most O(log n) bits of communication, compared to a protocol with a private coin.
Some often studied functions are:

• Equality: EQ(x, y) = 1 if x = y, and EQ(x, y) = 0 otherwise

• Inner product: $IP(x, y) = \sum_{i=1}^{n} x_i y_i \pmod 2$ (for $x, y \in \{0, 1\}^n$, $x_i$ is the ith bit of x)

• Intersection: INT(x, y) = 1 if there is an i where $x_i = y_i = 1$, and INT(x, y) = 0 otherwise
(viewing x as corresponding to the set $\{i : x_i = 1\}$ and similarly for y, INT(x, y) says whether
the sets x and y intersect). A variant of this problem asks to actually find an i where
$x_i = y_i = 1$, or to output that no such i exists.

Let us first consider the equality problem, which will recur throughout the text. The goal for Alice 
is to determine whether her n-bit input is the same as Bob's or not. It is not hard to show that 
in the deterministic case, n bits of communication are needed (see Section B.1 of the appendix for
a proof), so Bob might as well send his string to Alice after which Alice announces the answer to 
Bob with one more bit. 

To illustrate the power of randomness, let us give a simple yet efficient bounded-error protocol
for the equality problem. Alice and Bob jointly toss a random string $r \in \{0, 1\}^n$. Alice sends the
bit $a = x \cdot r$ to Bob (where '·' is inner product mod 2). Bob computes $b = y \cdot r$ and compares
this with a. If x = y then a = b, but if x ≠ y then a ≠ b with probability 1/2. Repeating this a
few times, Alice and Bob can decide equality with small error using O(n) public coin flips and a
constant amount of communication.
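
A minimal Python sketch of this public-coin protocol follows. The function names, the default number of repetitions, and the use of a shared `random.Random` object to stand in for the public coin are all our own illustrative choices.

```python
import random

def inner_product_mod2(u, v):
    """Inner product of two bit lists, modulo 2."""
    return sum(ui & vi for ui, vi in zip(u, v)) % 2

def public_coin_equality(x, y, repetitions=20, rng=None):
    """Return (answer, bits_sent): Bob's guess for EQ(x, y) and the number of bits Alice sent.

    The error is one-sided: if x == y the answer is always 1; if x != y the
    answer is wrongly 1 with probability 2 ** -repetitions.
    """
    rng = rng or random.Random()
    n = len(x)
    # Shared random strings, visible to both parties at no communication cost.
    shared = [[rng.randrange(2) for _ in range(n)] for _ in range(repetitions)]
    message = [inner_product_mod2(x, r) for r in shared]    # Alice -> Bob
    bob_bits = [inner_product_mod2(y, r) for r in shared]   # Bob's local computation
    return int(message == bob_bits), len(message)

x = [1, 0, 1, 1, 0, 0, 1, 0]
print(public_coin_equality(x, list(x)))
print(public_coin_equality(x, [1, 0, 1, 0, 0, 0, 1, 0]))
```

The communication is `repetitions` bits regardless of n, while the number of shared random bits is `repetitions * n`.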

This protocol uses public coins, but note that Newman's theorem implies that there exists
an O(log n)-bit protocol that uses a private coin. Let us explicitly describe such a protocol. Alice
views her n bits as the coefficients of a polynomial $p_x$ over some finite field F of about 3n elements:⁵
$p_x(t) = \sum_{i=1}^{n} x_i t^{i-1}$. She picks a random element $a \in F$, and sends Bob the pair $a, p_x(a)$, which
she can do using about 2 log(3n) bits. Bob computes $p_y(a)$ and outputs 1 if $p_x(a) = p_y(a)$, and outputs
0 otherwise. Clearly, if x = y then Bob always outputs the correct answer 1. However, if x ≠ y
then the polynomial $p_x(t) - p_y(t)$ is a polynomial in t of degree at most n − 1 that is not identically
equal to 0. Such a polynomial can be 0 on at most n − 1 elements of F. Hence with probability at
least 2/3, the field element a that Alice chose satisfies $p_x(a) \neq p_y(a)$, and Bob will give the correct
output also in this case.
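
The private-coin protocol can be sketched in Python as follows, using arithmetic modulo a prime $p \geq 3n$ as the finite field. The helper names and the trial-division prime search are our own illustrative choices, suitable only for small n.

```python
import random
from math import ceil, log2

def next_prime(m):
    """Smallest prime >= m (trial division; fine for illustration-sized m)."""
    def is_prime(k):
        return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))
    while not is_prime(m):
        m += 1
    return m

def evaluate(bits, a, p):
    """p_x(a) = sum_i x_i * a^(i-1) mod p, computed with Horner's rule."""
    value = 0
    for coeff in reversed(bits):
        value = (value * a + coeff) % p
    return value

def fingerprint_equality(x, y, rng=None):
    """Return (Bob's answer, bits sent by Alice) for one run of the protocol."""
    rng = rng or random.Random()
    p = next_prime(3 * len(x))
    a = rng.randrange(p)                  # Alice's private random field element
    message = (a, evaluate(x, a, p))      # Alice -> Bob: two field elements
    bits_sent = 2 * ceil(log2(p))
    return int(evaluate(y, message[0], p) == message[1]), bits_sent

x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [1, 0, 1, 0, 0, 0, 1, 1]
p = next_prime(3 * len(x))
collisions = sum(evaluate(x, a, p) == evaluate(y, a, p) for a in range(p))
print(p, collisions)   # field size and number of "bad" choices of a
```

Counting the field elements on which the two fingerprints collide makes the error bound concrete: the count never exceeds n − 1, so the error probability is below 1/3 once the field has at least 3n elements.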

3.2 The quantum question 

Now what happens if we give Alice and Bob a quantum computer and allow them to send each 
other qubits and/or to make use of ebits that they share at the start of the protocol? 

Formally speaking, we can model a quantum protocol as follows. The total state consists
of 3 parts: Alice's private space, the channel, and Bob's private space. The starting state is
$|x\rangle|0\rangle|y\rangle$: Alice gets x, the channel is initialized to 0, and Bob gets y. Now Alice applies a unitary
transformation to her space and the channel. This corresponds to her private computation as well
as to putting a message on the channel (the length of this message is the number of channel-qubits
affected by Alice's operation). Then Bob applies a unitary transformation to his space and the
channel, etc. At the end of the protocol Alice or Bob makes a measurement to determine the
output of the protocol. This model was introduced by Yao [137].

5 For those not familiar with finite fields: it suffices to choose a prime number p ≈ 3n and do all additions and
multiplications modulo this p.

In the second model, introduced by Cleve and Buhrman [45], Alice and Bob share an unlimited
number of ebits at the start of the protocol, but now they communicate via a classical channel: the 
channel has to be in a classical state throughout the protocol. We only count the communication, 
not the number of ebits used. Protocols of this kind can simulate protocols of the first kind with 
only a factor 2 overhead: using teleportation, the parties can send each other a qubit using an ebit 
and two classical bits of communication. Hence the qubit-protocols that we describe below also 
immediately yield protocols that work with entanglement and a classical channel. Note that an 
ebit can simulate a public coin toss: if Alice and Bob each measure their half of the pair of qubits, 
they get the same random bit. 

The third variant combines the strengths of the other two: here Alice and Bob start out with an 
unlimited number of ebits and they are allowed to communicate qubits. This third kind of commu- 
nication complexity is in fact equivalent to the second, up to a factor of 2, again by teleportation. 

Before continuing to study this model, we first have to face an important question, already 
mentioned in the introduction: is there anything to be gained here? At first sight, the following 
argument seems to rule out any significant gain. Suppose that in the classical world k bits have to 
be communicated in order to compute f. Since Holevo's theorem says that k qubits cannot contain
more information than k classical bits, it seems that the quantum communication complexity should 
be roughly k qubits as well (maybe k/2 to account for superdense coding, but not less). Surpris- 
ingly (and fortunately for us), this argument is false, and quantum communication can sometimes 
be much less than classical communication complexity. The information-theoretic argument via 
Holevo's theorem fails, because Alice and Bob do not need to communicate the information in the 
k bits of the classical protocol; they are only interested in the value f(x,y), which is just 1 bit. 
Below we will survey some of the main examples that have so far been found of differences between 
quantum and classical communication complexity. 

3.3 The first examples 

Quantum communication complexity was introduced by Yao [137] and studied by Kremer [82], but
neither showed any advantages of quantum over classical communication. Cleve and Buhrman [45]
introduced the variant with classical communication and prior entanglement, and exhibited the
first quantum protocol provably better than any classical protocol. It uses quantum entanglement
to save 1 bit of classical communication. This gap was extended by Buhrman, Cleve, and van
Dam [31] and, for arbitrary k parties, by Buhrman, van Dam, Høyer, and Tapp [34].

3.4 Distributed Deutsch-Jozsa 

The first impressively large gaps between quantum and classical communication complexity were 
exhibited by Buhrman, Cleve, and Wigderson [33]. Their protocols are distributed versions of 
known quantum query algorithms, like the Deutsch-Jozsa [56] and Grover [71] algorithms. 

Let us start with the first one. It is actually explained most easily in a direct way, without
reference to the Deutsch-Jozsa algorithm (though that is where the idea came from). The problem
deals with a promise version of the equality problem. Suppose the n-bit inputs x and y are restricted
to the following case:

DJ promise: either x = y, or x and y differ in exactly n/2 positions

Note that this promise only makes sense if n is an even number, otherwise n/2 would not be an integer.
In fact it will be convenient to assume that n is a power of 2. Here is a simple quantum protocol to solve
this promise version of equality using only log n qubits:

1. Alice sends Bob the log n-qubit state $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} (-1)^{x_i} |i\rangle$, which she can prepare unitarily from
x and log n $|0\rangle$-qubits.

2. Bob applies the unitary map $|i\rangle \mapsto (-1)^{y_i} |i\rangle$ to the state, applies a Hadamard transform to
each qubit (for this it is convenient to view i as a log n-bit string), and measures the resulting
log n-qubit state.

3. Bob outputs 1 if the measurement gave $|0^{\log n}\rangle$ and outputs 0 otherwise.

It is clear that this protocol only communicates log n qubits, but why does it work? Note that the
state that Bob measures is

\[
H^{\otimes \log n} \left( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (-1)^{x_i + y_i} |i\rangle \right)
= \frac{1}{n} \sum_{i=1}^{n} (-1)^{x_i + y_i} \sum_{j \in \{0,1\}^{\log n}} (-1)^{i \cdot j} |j\rangle .
\]

This superposition looks rather unwieldy, but consider the amplitude of the $|0^{\log n}\rangle$ basis state. It
is $\frac{1}{n} \sum_{i=1}^{n} (-1)^{x_i + y_i}$, which is 1 if x = y and 0 otherwise because the promise now guarantees that
x and y differ in exactly n/2 of the bits! Hence Bob will always give the correct answer.
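
The three steps of the protocol can be simulated classically for small n by tracking the n amplitudes directly. The sketch below is an illustration under our own conventions: indices run from 0 to n − 1 rather than 1 to n, and the Hadamard transform on all qubits is implemented as a fast Walsh-Hadamard transform.

```python
from math import sqrt, isclose

def hadamard_transform(amplitudes):
    """Apply H to every qubit of a state vector of length n = 2^k."""
    state = list(amplitudes)
    n = len(state)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                u, v = state[j], state[j + h]
                state[j], state[j + h] = (u + v) / sqrt(2), (u - v) / sqrt(2)
        h *= 2
    return state

def distributed_deutsch_jozsa(x, y):
    """Return the probability that Bob observes the all-zero string, i.e. outputs 1."""
    n = len(x)
    message = [(-1) ** xi / sqrt(n) for xi in x]                 # step 1: Alice's log n qubits
    phased = [amp * (-1) ** yi for amp, yi in zip(message, y)]   # step 2: Bob's phase flip
    final = hadamard_transform(phased)                           # step 2: Hadamard on each qubit
    return final[0] ** 2                                         # step 3: weight on |0...0>

x = [0, 1, 1, 0, 1, 0, 0, 1]
y_far = [1, 0, 0, 1, 1, 0, 0, 1]   # differs from x in exactly n/2 = 4 positions
print(round(distributed_deutsch_jozsa(x, x), 10))
print(round(distributed_deutsch_jozsa(x, y_far), 10))
```

Under the promise the two cases give probabilities 1 and 0 respectively, so Bob never errs; for inputs violating the promise the probability can lie strictly between these values.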

What about efficient classical protocols (without entanglement) for this problem? Proving
lower bounds on communication complexity often requires a very technical combinatorial analysis.
Buhrman, Cleve, and Wigderson used a deep combinatorial result of Frankl and Rödl [62] to prove
that every classical errorless protocol for this problem needs to send at least 0.007n bits. We give
the details in Appendix B.4.

This log n-qubits-vs-0.007n-bits example was the first exponentially large separation of quantum
and classical communication complexity. Notice, however, that the difference disappears if we move
to the bounded-error setting, allowing the protocol to have some small error probability. We can
use the randomized protocol for equality discussed above or even simpler: Alice can just send a few
$(i, x_i)$ pairs to Bob, who then compares the $x_i$'s with his $y_i$'s. If x = y he will not see a difference,
but if x and y differ in n/2 positions, then each uniformly random position reveals a difference with
probability 1/2, so k pairs all miss with probability only $2^{-k}$. Hence O(log n) classical
bits of communication suffice in the bounded-error setting, in sharp contrast to the errorless setting.

3.5 The Intersection problem 

Now consider the Intersection function, which is 1 if $x_i = y_i = 1$ for at least one i. Note that
this is a decision version of the appointment-scheduling problem mentioned in the introduction.
Buhrman, Cleve, and Wigderson [33] also presented an efficient quantum protocol for this. Their
protocol is based on Lov Grover's famous quantum search algorithm [74], which we will briefly
sketch here.

Suppose there is some n-bit string z and we would like to find an index i such that $z_i = 1$. We
cannot "look" at z directly, but we can apply the following unitary map:

\[
O_z : |i\rangle \mapsto (-1)^{z_i} |i\rangle .
\]

Grover's algorithm starts in a uniform superposition $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} |i\rangle$ and then repeatedly applies the
following unitary Grover iterate to the state:

\[
G = H^{\otimes \log n} \, O_0 \, H^{\otimes \log n} \, O_z ,
\]

where $H^{\otimes \log n}$ is the log n-qubit Hadamard transform, and $O_0$ is the unitary that puts a '−' in
front of the all-0 state. Suppose there are exactly t solutions: t indices i where $z_i = 1$. We will not
give the analysis here (see for instance [24]), but one can show that after about $\frac{\pi}{4}\sqrt{n/t}$ Grover
iterations, most of the amplitude of the state sits on such solutions. Measuring the state will now
with high probability give us a solution. Of course we may not know t in advance, but there is a
way to find a solution with high probability using $O(\sqrt{n})$ Grover iterates even in that case.

Now what about the Intersection problem? Note that we just want to find a solution for the
string $z = x \wedge y$, which is the bit-wise AND of $x$ and $y$, since $z_i = 1$ whenever both $x_i = 1$ and
$y_i = 1$. The idea is now to let Alice run Grover's algorithm to search for such a solution. Clearly,
she can prepare the uniform starting state herself. She can also apply $H$ and $O_0$ herself. The only
place where she needs Bob's help is in implementing $O_z$. This they do as follows. Whenever Alice
wants to apply $O_z$ to a state

$$|\phi\rangle = \sum_{i=1}^{n} \alpha_i |i\rangle,$$

she tags on her $x_i$ in an extra qubit and sends Bob the state

$$\sum_{i=1}^{n} \alpha_i |i\rangle|x_i\rangle.$$

Bob applies the unitary map

$$|i\rangle|x_i\rangle \mapsto (-1)^{x_i \wedge y_i}|i\rangle|x_i\rangle$$

and sends back the result. Alice sets the last qubit back to $|0\rangle$ (which she can do unitarily because
she has $x$), and now she has the state $O_z|\phi\rangle$. Thus we can simulate $O_z$ using 2 messages of $\log(n)+1$
qubits each. Hence Alice and Bob can run Grover's algorithm to find an intersection, using $O(\sqrt{n})$
messages of $O(\log n)$ qubits each, for total communication of $O(\sqrt{n}\log n)$ qubits. Later Aaronson
and Ambainis [1] gave a more complicated protocol that uses $O(\sqrt{n})$ qubits of communication.
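
The three steps of the distributed oracle call (Alice tags, Bob adds a phase, Alice untags) can be traced in a small classical simulation of the amplitudes. This is only an illustrative sketch: the state is stored as a dictionary from (index, tag bit) to amplitude, indices are 0-based, and all function names are hypothetical.

```python
import math

def alice_tag(amps, x):
    """Alice attaches |x_i> to each |i>; state is {(i, tag_bit): amplitude}."""
    return {(i, x[i]): a for i, a in enumerate(amps)}

def bob_phase(state, y):
    """Bob maps |i>|b> to (-1)^(b AND y_i) |i>|b>."""
    return {(i, b): (-a if (b & y[i]) else a) for (i, b), a in state.items()}

def alice_untag(state, x):
    """Alice resets the tag qubit to |0>, which she can do because she knows x."""
    return [state[(i, x[i])] for i in range(len(x))]

def distributed_oracle(amps, x, y):
    """One simulated application of O_z for z = x AND y."""
    return alice_untag(bob_phase(alice_tag(amps, x), y), x)

def qubits_per_oracle_call(n):
    """Two messages of log2(n) + 1 qubits each."""
    return 2 * (math.ceil(math.log2(n)) + 1)

x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
uniform = [1 / math.sqrt(len(x))] * len(x)
print(distributed_oracle(uniform, x, y), qubits_per_oracle_call(len(x)))
```

The output amplitudes carry a minus sign exactly at the positions where both $x_i$ and $y_i$ are 1, which is the action of $O_z$ for $z = x \wedge y$.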

What about lower bounds? It is a well-known result of classical communication complexity 
that classical bounded-error protocols for the Intersection problem need about n bits of communi- 
cation [THJQTT]. Thus we have a quadratic quantum-classical separation for this problem. Could the 
separation be even bigger than quadratic? This question was open for quite a few years after [33] 
appeared, until finally Razborov [112] showed that any bounded-error quantum protocol for
Intersection needs to communicate about $\sqrt{n}$ qubits. His proof is beautiful but deep and complicated.
We sketch it in Appendix C.

3.6 Raz's problem

Notice the contrast between the examples of the last two sections. For the Distributed Deutsch- 
Jozsa problem we get an exponential quantum-classical separation, but the separation only holds 
if we require the classical protocol to be errorless. On the other hand, the gap for the disjointness 
function is only quadratic, but it holds even if we allow classical protocols to have some error 
probability. 

Raz [110] exhibited a function where the quantum-classical separation has both features: the
quantum protocol is exponentially better than the classical protocol, even if the latter is allowed 
some error probability. Consider the following promise problem P: 

Alice receives a unit vector $v \in \mathbb{R}^m$ and a decomposition of the corresponding space in
two orthogonal subspaces $H^{(0)}$ and $H^{(1)}$.

Bob receives an $m \times m$ unitary transformation $U$.

Promise: $Uv$ is either "close" to $H^{(0)}$ or to $H^{(1)}$ (more precisely, letting $P$ be the
projector on subspace $H$, a vector $v$ is close to $H$ if $\|Pv\|^2 > 2/3$).

Question: which of the two?

As stated, this is a problem with continuous input, but it can be discretized in a natural way by
approximating each real number by $O(\log m)$ bits. Alice and Bob's input is now $n = O(m^2 \log m)$
bits long. There is a simple yet efficient 2-round quantum protocol for this problem: Alice views
$v$ as a $\log m$-qubit vector and sends this to Bob. Bob applies $U$ and sends back the result. Alice
then measures in which subspace $H^{(i)}$ the vector $Uv$ lies and outputs the resulting $i$. This takes
only $2\log m = O(\log n)$ qubits of communication.

The efficiency of this protocol comes from the fact that an $m$-dimensional unit vector can be
"compressed" or "represented" as a $\log m$-qubit state. Similar compression is not possible with
classical bits, which suggests that any classical protocol for $P$ will have to send the vector $v$ more
or less literally and hence will require a lot of communication. This turns out to be true but the
proof (given in [110]) is surprisingly hard. It shows that any bounded-error protocol for $P$ needs
to send at least about $n^{1/4}/\log n$ bits.

3.7 The Hidden Matching problem 

Consider the following promise problem HM from [9], for even integer $n$:

Alice receives a string $x \in \{0,1\}^n$.

Bob receives a perfect matching $M$ on $\{1, \ldots, n\}$ (i.e., a partition into $n/2$ disjoint pairs
$M = \{(i_1, j_1), \ldots, (i_{n/2}, j_{n/2})\}$).

Question: output a triple $(i, j, x_i \oplus x_j)$ for some $(i,j) \in M$.

This communication problem is not a function, but a relation: for each input-pair $x, M$ there
are $n/2$ different correct answers instead of only one: $(i, j, x_i \oplus x_j)$ is correct for each $(i,j) \in M$.
We consider one-way protocols here, where Alice sends one message to Bob and then Bob should
produce a triple $(i, j, x_i \oplus x_j)$.

We now describe a quantum protocol where Alice sends only $O(\log n)$ qubits and Bob gives one
of the correct answers with probability 1 [9]. Alice sends Bob the following $\log n$-qubit message:

$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} (-1)^{x_i} |i\rangle.$$

Bob views $M$ as an orthogonal decomposition of the space $\mathbb{C}^n$ into $n/2$ 2-dimensional subspaces. For
instance, the projector for the subspace corresponding to $(i,j) \in M$ would be $P_{ij} = |i\rangle\langle i| + |j\rangle\langle j|$.
Bob applies this measurement on the state he received, and obtains the label of some random
$(i,j) \in M$ as well as the projected state

$$\frac{1}{\sqrt{2}}\left((-1)^{x_i}|i\rangle + (-1)^{x_j}|j\rangle\right).$$

An appropriate measurement on this state will give Bob the bit $x_i \oplus x_j$ with certainty, and he can
output the correct answer $(i, j, x_i \oplus x_j)$.
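
The protocol can be followed step by step in a short simulation. The sketch below is an illustration under stated assumptions rather than the construction of [9] verbatim: indices are 0-based, the "appropriate measurement" is taken to be the one in the basis $(|i\rangle \pm |j\rangle)/\sqrt{2}$, and the function name is hypothetical.

```python
import math
import random

def hidden_matching_quantum(x, matching, rng=random):
    """Simulate the one-way quantum protocol; returns (i, j, x_i XOR x_j)."""
    n = len(x)
    # Alice's message: amplitude (-1)^{x_i} / sqrt(n) on basis state |i>
    amp = [(-1) ** x[i] / math.sqrt(n) for i in range(n)]
    # Bob's first measurement: pair (i, j) occurs with probability amp_i^2 + amp_j^2 = 2/n
    weights = [amp[i] ** 2 + amp[j] ** 2 for i, j in matching]
    i, j = rng.choices(matching, weights=weights)[0]
    # Projected and renormalised state on span{|i>, |j>}
    norm = math.sqrt(amp[i] ** 2 + amp[j] ** 2)
    ai, aj = amp[i] / norm, amp[j] / norm
    # Second measurement in the basis (|i> +/- |j>)/sqrt(2): probability of '+'
    p_plus = ((ai + aj) / math.sqrt(2)) ** 2
    # p_plus is 0 or 1 up to rounding, so the outcome is certain
    bit = 0 if p_plus > 0.5 else 1
    return i, j, bit

x = [1, 0, 0, 1, 1, 1, 0, 0]
matching = [(0, 5), (1, 2), (3, 6), (4, 7)]
print(hidden_matching_quantum(x, matching))
```

The '+' outcome occurs exactly when the two signs agree, that is when $x_i \oplus x_j = 0$, which is why Bob learns the parity with certainty.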

What about classical protocols? First note that the HM problem can be solved by a short
classical message from Bob to Alice: Bob sends Alice a pair $(i,j) \in M$ using $2\log n$ bits, which
allows Alice to compute $x_i \oplus x_j$. But the situation is radically different if we consider classical
one-way communication from Alice to Bob only. Indeed, one can show that if Alice sends Bob pairs
$(i, x_i)$ for $O(\sqrt{n})$ randomly chosen $i$'s, then Bob probably received both points from at least one
pair in $M$.⁶ This allows him to output a correct answer. On the other hand, Bar-Yossef, Jayram,
and Kerenidis [9] proved that any classical protocol solving the Hidden Matching problem, even
with small error probability and involving only one-way communication from Alice to Bob, needs
messages of length at least about $\sqrt{n}$. Thus we have an exponential separation between classical
one-way protocols and quantum one-way protocols.
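
The birthday-paradox behaviour of the classical upper bound is easy to observe empirically. The following Monte Carlo sketch estimates how often Bob receives both endpoints of some pair of a random matching when Alice reveals about $c\sqrt{n}$ random positions. The constant $c$, the function name, and the trial counts are assumptions made for illustration. The values $x_i$ play no role in whether a pair is covered, so they are omitted.

```python
import math
import random

def classical_one_way_success(n, c, trials, seed=0):
    """Fraction of trials in which c*sqrt(n) random indices cover some matched pair."""
    rng = random.Random(seed)
    k = min(n, int(c * math.sqrt(n)))
    successes = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        # A uniformly random perfect matching on {0, ..., n-1}
        matching = [(perm[2 * t], perm[2 * t + 1]) for t in range(n // 2)]
        # Indices i for which Alice sends the pair (i, x_i)
        sent = set(rng.sample(range(n), k))
        if any(i in sent and j in sent for i, j in matching):
            successes += 1
    return successes / trials

print(classical_one_way_success(1024, 4, 200))
print(classical_one_way_success(1024, 0.1, 200))
```

With $n = 1024$, revealing $4\sqrt{n} = 128$ positions almost always covers a pair, while revealing only a few positions almost never does.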

Variants of the Hidden Matching problem have been used recently to obtain other
quantum-classical separations. For example, Gavinsky et al. [67] showed a
$\log n$-qubits-versus-$\sqrt{n}$-classical-bits separation for one-way protocols for a Boolean function
derived from the Hidden Matching problem (while HM itself is a relational problem). Gavinsky [65] used another variant of HM
to exhibit a relational problem where quantum one-way protocols are exponentially more efficient 
than classical two-way protocols. 

3.8 Inner product 

In the previous sections we gave examples of quantum-classical separations. The parameters were 
different, but in each case we showed that there was a quantum protocol for the problem at hand 
that required far less communication than the best classical protocols. Could this always be the 
case? Could quantum communication complexity be much more efficient for every communication 
complexity problem? The answer to this is negative — in fact for most communication complexity 
problems, quantum communication does not help much. 

An important example is the inner product function ($\mathrm{IP}(x,y) = x \cdot y = \sum_{i=1}^{n} x_i y_i \pmod 2$).
All protocols, both classical and quantum, need to send about $n$ bits/qubits to solve this. We will
sketch the proof of [46] here for the case of errorless quantum protocols with qubit communication
and without entanglement; the proof for the more general case with entanglement is slightly more
complicated. The proof uses the IP-protocol to communicate Alice's $n$-bit input to Bob, and then
invokes Holevo's theorem to conclude that many qubits must have been communicated in order to
achieve this. Suppose Alice and Bob have some protocol $P$ for IP. They can use this to compute
the following mapping⁷:

$$|x\rangle|y\rangle \mapsto |x\rangle (-1)^{x \cdot y} |y\rangle. \qquad (16)$$

6 This is due to an effect called the "birthday paradox" or "birthday problem". It states that if we throw roughly
$\sqrt{n}$ balls into $n$ bins at random, then probably there will be a bin containing at least two balls.

7 This is an oversimplification of matters: in order to get the map of Eq. (16), one first needs to construct a new

Now suppose Alice starts with an arbitrary $n$-bit state $|x\rangle$ and Bob starts with the uniform
superposition $\frac{1}{\sqrt{2^n}}\sum_{y \in \{0,1\}^n} |y\rangle$. If they apply the above mapping, the final state becomes

$$|x\rangle \frac{1}{\sqrt{2^n}} \sum_{y \in \{0,1\}^n} (-1)^{x \cdot y} |y\rangle.$$

If Bob applies a Hadamard transform to each of his $n$ qubits, then he obtains the basis state $|x\rangle$,
so Alice's $n$ classical bits have been communicated to Bob. Holevo's theorem now implies that the
IP-protocol must communicate $n$ qubits (which can trivially be achieved). The same argument
can, with a minor modification, be made to work even if Alice and Bob share unlimited prior
entanglement, yielding a lower bound of $n/2$ qubits (which can trivially be achieved using dense
coding). With some more technical complication, the same idea gives a $\frac{1}{2}(1 - 2\varepsilon)^2 n$ lower bound
for $\varepsilon$-error protocols [46]. The constant factor in this bound was subsequently improved to the
optimal $\frac{1}{2}$ by Nayak and Salzman [100].
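
The decoding step, in which Bob's Hadamard transform turns the phases $(-1)^{x \cdot y}$ back into the basis state $|x\rangle$, can be verified with integer arithmetic. The sketch below works with unnormalised amplitudes and a standard fast Walsh-Hadamard butterfly. The function names are hypothetical.

```python
def phase_state(x_bits):
    """Unnormalised amplitudes (-1)^{x.y} for all y in {0,1}^n (y as an integer index)."""
    n = len(x_bits)
    x = int("".join(map(str, x_bits)), 2)
    return [(-1) ** bin(x & y).count("1") for y in range(2 ** n)]

def walsh_hadamard(v):
    """Unnormalised n-qubit Hadamard transform via the usual butterfly."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                a, b = v[k], v[k + h]
                v[k], v[k + h] = a + b, a - b
        h *= 2
    return v

def decode(x_bits):
    """Recover x from the phase state; exactly one amplitude is nonzero."""
    n = len(x_bits)
    out = walsh_hadamard(phase_state(x_bits))
    support = [y for y, amp in enumerate(out) if amp != 0]
    assert len(support) == 1
    return [int(b) for b in format(support[0], f"0{n}b")]

print(decode([1, 0, 1, 1]))
```

Since all amplitude ends up on a single basis state, Bob learns every bit of $x$, which is what allows Holevo's theorem to be applied.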

4 Non-locality and Communication Complexity 
4.1 Converting communication complexity to non-locality 

In Section 2 we introduced several simple non-locality scenarios. Then in Section 3 we introduced
communication complexity, and gave several problems for which there are large, sometimes expo- 
nential, separations between the classical and quantum communication complexity. In this section 
we shall put together these two approaches, and derive from the communication complexity prob- 
lems new non-locality problems which are very hard, sometimes exponentially hard, to solve in 
a classical model. In particular we shall present non-locality problems based on the Distributed 
Deutsch-Jozsa problem and on the Hidden Matching problem. In Section 7 we shall come back to
these non-locality problems, and will discuss these newly developed tests in the context of experi- 
mental errors. 

In this section we shall use the following mapping which, when applicable, is very powerful. 

Mapping one-way quantum communication complexity to non-locality. 

Consider a communication complexity problem where the number $q$ of qubits exchanged
in the quantum communication model with one-way communication from Alice to Bob
is less than the number $c$ of bits required to solve the problem classically when the
parties have shared randomness; and further suppose that, due to some symmetry of
the problem, it can be solved if Alice starts with an arbitrary basis state $|k\rangle$ (the value
of $k$ being known beforehand to both Alice and Bob) as follows: she carries out a
transformation $U_A(x)$ on this state (that depends on her input $x$ but does not depend
on $k$), sends it to Bob who carries out a transformation $U_B(y)$ (that depends on his
input $y$ but does not depend on $k$) and then measures in the computational basis. The
probability of finding result $\ell$ is thus $|\langle \ell | U_B(y) U_A(x) | k \rangle|^2$. From the knowledge of $\ell$, $k$,
and $y$, Bob can find the value of the function $f(x,y)$.

protocol $P^{-1}$ which is the reverse of the original communication protocol $P$. This can be done without error because
the original protocol is without error. Combining protocols $P$ and $P^{-1}$ one can obtain map (16). If protocol $P$ uses
$c$ qubits of communication, protocol $P^{-1}$ also uses $c$ qubits, and the protocol for obtaining state (16) uses $2c$ qubits.
But the crucial point is that still at most $c$ qubits are sent from Alice to Bob, since $P^{-1}$ is the reverse of $P$. Holevo's
theorem lower bounds the communication from Alice to Bob, and hence we get a lower bound of $n$ qubits on $c$.

Now consider the following process: Alice and Bob share a maximally entangled state
$|\psi\rangle = 2^{-q/2} \sum_{i=0}^{2^q - 1} |i\rangle|i\rangle$. Alice carries out a local transformation $U_A(x)^T$ (where "T"
means transposition in the $|i\rangle$-basis); she measures in the computational basis. Bob
carries out the transformation $U_B(y)$; he measures in the computational basis. Suppose
that Alice obtains outcome $k$ and Bob obtains outcome $\ell$. The probability of finding
these joint outcomes is $P(k,\ell|x,y) = |\langle \ell|\langle k| U_B(y) U_A(x)^T |\psi\rangle|^2 = 2^{-q} |\langle \ell| U_B(y) U_A(x) |k\rangle|^2$
(the last equality is easy to check). If Alice now sends to Bob the outcome $k$ of her
measurement (which requires $q$ bits), then Bob can compute $f(x,y)$. Thus this constitutes
a solution of the communication complexity problem in the entanglement model with
half the communication that would be required if they had used the trivial mapping
based on teleportation. More importantly, the correlations $P(k,\ell|x,y)$ are non-local,
since they could not be obtained in a classical model with shared randomness without
at least $c - q > 0$ bits of classical communication.
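
For completeness, here is one way to check the equality of the two probabilities. It is a short worked derivation added for illustration. It writes $U_A$, $U_B$ for $U_A(x)$, $U_B(y)$ and uses only the identity $\langle k|U_A^T|i\rangle = \langle i|U_A|k\rangle$ for the transpose in the computational basis.

```latex
\begin{align*}
\left(U_A^T \otimes U_B\right)|\psi\rangle
  &= 2^{-q/2} \sum_{i} U_A^T|i\rangle \otimes U_B|i\rangle
   = 2^{-q/2} \sum_{i,k} \langle k|U_A^T|i\rangle\, |k\rangle \otimes U_B|i\rangle \\
  &= 2^{-q/2} \sum_{k} |k\rangle \otimes U_B \sum_{i} \langle i|U_A|k\rangle\, |i\rangle
   = 2^{-q/2} \sum_{k} |k\rangle \otimes U_B U_A|k\rangle, \\
\langle k|\langle \ell|\left(U_A^T \otimes U_B\right)|\psi\rangle
  &= 2^{-q/2}\, \langle \ell|U_B U_A|k\rangle
  \quad\Longrightarrow\quad
  P(k,\ell|x,y) = 2^{-q}\, \bigl|\langle \ell|U_B U_A|k\rangle\bigr|^2 .
\end{align*}
```

In words: Alice's outcome $k$ is uniformly random, and conditioned on it Bob's outcome is distributed exactly as in the one-way protocol started from $|k\rangle$.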



4.2 Non-local version of the Distributed Deutsch-Jozsa problem 

The above mapping can be applied to the Distributed Deutsch-Jozsa problem from Section 3.4.
We describe here the result of the mapping.

Non-local DJ problem: Alice and Bob receive $n$-bit inputs $x$ and $y$ that satisfy the
DJ promise: either $x = y$, or $x$ and $y$ differ in exactly $n/2$ positions. The task is for
Alice and Bob to provide outputs $a, b \in \{0,1\}^{\log n}$ such that when $x = y$ then $a = b$,
and when $x$ and $y$ differ in exactly $n/2$ positions then $a \neq b$.

They achieve this as follows:

1. Alice and Bob share the maximally entangled state $\frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} |i\rangle|i\rangle$.

2. Alice and Bob both apply locally a conditional phase to obtain: $\frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} (-1)^{x_i} (-1)^{y_i} |i\rangle|i\rangle$.

3. Alice and Bob both apply a Hadamard transform:
$\frac{1}{n\sqrt{n}} \sum_{a=0}^{n-1} \sum_{b=0}^{n-1} \left( \sum_{i=0}^{n-1} (-1)^{x_i + y_i + i \cdot (a \oplus b)} \right) |a\rangle|b\rangle$.

4. Alice and Bob measure in the computational basis.

For every $a$, the probability that both Alice and Bob obtain the same result $a$ is:

$$\left| \frac{1}{n\sqrt{n}} \sum_{i=0}^{n-1} (-1)^{x_i + y_i} \right|^2,$$

which is $1/n$ if $x = y$ and 0 otherwise. Hence this solves the problem.

Note that if Alice then communicated the result of her measurement to Bob (using $\log n$ bits), he
could solve the Distributed Deutsch-Jozsa problem since he could then check whether $a = b$ or $a \neq b$.
But we know that solving the Distributed Deutsch-Jozsa problem requires at least $0.007n$ bits. Thus
we have a non-locality problem that can be solved if Alice and Bob share $\log n$ ebits, but which
requires about $0.007n$ bits to be solved in a classical model with shared randomness and classical
communication. Note that this very large lower bound on the amount of classical communication
would disappear in the bounded-error setting where we allow the correlations $P(a,b|x,y)$ to differ
slightly from the ideal correlations.
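
The joint outcome distribution of this protocol can be tabulated directly from the amplitude after the Hadamard step, namely $\frac{1}{n\sqrt{n}} \sum_i (-1)^{x_i + y_i + i\cdot(a \oplus b)}$. The sketch below does this for a small $n$ (a power of 2). The function names are hypothetical, and outputs $a, b$ are represented as integers in $\{0, \ldots, n-1\}$.

```python
def hadamard_sign(i, a):
    """(-1)^{i.a}, with i.a the inner product mod 2 of the binary expansions."""
    return -1 if bin(i & a).count("1") % 2 else 1

def joint_distribution(x, y):
    """P(a, b | x, y) for the non-local DJ protocol; len(x) must be a power of 2."""
    n = len(x)
    probs = {}
    for a in range(n):
        for b in range(n):
            s = sum((-1) ** (x[i] + y[i]) * hadamard_sign(i, a ^ b) for i in range(n))
            probs[(a, b)] = s * s / n ** 3
    return probs

x = [0, 1, 1, 0, 1, 0, 0, 1]
y_equal = list(x)
y_far = [1 - b for b in x[:4]] + x[4:]      # differs from x in exactly n/2 positions

p_eq = joint_distribution(x, y_equal)
p_far = joint_distribution(x, y_far)
print(sum(p_eq[(a, a)] for a in range(8)))   # probability that a == b when x == y
print(sum(p_far[(a, a)] for a in range(8)))  # probability that a == b under the promise
```

When $x = y$ all probability sits on the diagonal $a = b$, and when $x$ and $y$ differ in exactly $n/2$ positions the diagonal has probability zero.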

4.3 Non-local version of the Hidden Matching problem 

The same mapping can be applied to the Hidden Matching problem to yield a non-locality problem. 

Non-local HM problem: Assume that $n = 2^m$, so we can index the numbers between
1 and $n$ with $m$-bit strings.

Alice receives a string $x \in \{0,1\}^n$. Bob receives a perfect matching $M$ on $\{1, \ldots, n\}$
(i.e., a partition into $n/2$ disjoint pairs).

Alice must give as output some $k \in \{0,1\}^m$. Bob must give as output a matching pair
$(i,j) \in M$ and $\ell \in \{0,1\}^m$.

Alice and Bob's output must satisfy $(i \cdot (k \oplus \ell)) + (j \cdot (k \oplus \ell)) = x_i + x_j \bmod 2$ (recall that
$a \cdot b = \sum_i a_i b_i$ is the inner product between bitstrings $a$ and $b$, and $a \oplus b$ is the bitwise
XOR of $a$ and $b$: the $i$th bit of $a \oplus b$ is $a_i \oplus b_i$).

Note that if at the end of the protocol, Alice sends $k$ to Bob at a cost of $m = \log n$ classical
bits, then Bob has enough information to compute the triple $(i, j, x_i \oplus x_j)$, i.e., to solve the Hidden
Matching problem as defined in Section 3.7. But we know that classical one-way communication
from Alice to Bob needs about $\sqrt{n}$ bits to solve the Hidden Matching problem. Therefore the
correlations in the non-local HM problem themselves can only be reproduced if Alice sends Bob at
least about $\sqrt{n}$ bits of communication (if we are restricted to one-way).

Let us show that Alice and Bob can obtain the correlations of the non-local HM problem using
local measurements on $m = \log n$ ebits. The initial state is:

$$\frac{1}{\sqrt{n}} \sum_{i \in \{0,1\}^m} |i\rangle|i\rangle.$$

Alice adds the phases $(-1)^{x_i}$. Bob views $M$ as an orthogonal decomposition of the space $\mathbb{C}^n$
into $n/2$ 2-dimensional subspaces. For instance, the projector for the subspace corresponding to
$(i,j) \in M$ would be $P_{ij} = |i\rangle\langle i| + |j\rangle\langle j|$. Bob applies this measurement to his half of the state,
and obtains the label of some random $(i,j) \in M$. This projects the joint state to

$$\frac{1}{\sqrt{2}} \left( (-1)^{x_i} |i\rangle|i\rangle + (-1)^{x_j} |j\rangle|j\rangle \right).$$

Now they both apply Hadamard transforms to each of their qubits. This gives the state

$$\frac{(-1)^{x_i}}{\sqrt{2}\, n} \sum_{k,\ell \in \{0,1\}^m} (-1)^{i \cdot k + i \cdot \ell} |k\rangle|\ell\rangle
+ \frac{(-1)^{x_j}}{\sqrt{2}\, n} \sum_{k,\ell \in \{0,1\}^m} (-1)^{j \cdot k + j \cdot \ell} |k\rangle|\ell\rangle.$$

Both parties measure their half of the state in the computational basis. They obtain $m$-bit strings
$k$ and $\ell$, respectively, satisfying $x_i + i \cdot (k \oplus \ell) = x_j + j \cdot (k \oplus \ell)$ (modulo 2), since the other $k,\ell$-pairs
have amplitude 0. This gives: $i \cdot (k \oplus \ell) + j \cdot (k \oplus \ell) = x_i + x_j$ (modulo 2).
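
The claim that only the "good" pairs $(k, \ell)$ survive can be checked exhaustively for small $m$. The sketch below enumerates, for a fixed pair $(i,j)$ selected by Bob's measurement, all $(k,\ell)$ whose amplitude (up to the common factor $1/(\sqrt{2}\,n)$) is nonzero. Indices are taken from $\{0, \ldots, n-1\}$ and identified with $m$-bit strings. The function names are hypothetical.

```python
def dot(a, b):
    """Inner product mod 2 of the binary expansions of the integers a and b."""
    return bin(a & b).count("1") % 2

def possible_outcomes(x, i, j):
    """All (k, l) with nonzero amplitude once Bob's measurement has selected (i, j)."""
    n = len(x)
    outcomes = []
    for k in range(n):
        for l in range(n):
            # i.k + i.l = i.(k XOR l) mod 2, and likewise for j
            amp = (-1) ** (x[i] + dot(i, k ^ l)) + (-1) ** (x[j] + dot(j, k ^ l))
            if amp != 0:
                outcomes.append((k, l))
    return outcomes

x = [1, 0, 0, 1, 1, 1, 0, 0]        # n = 8, so m = 3
i, j = 2, 5
outs = possible_outcomes(x, i, j)
print(len(outs), all((dot(i, k ^ l) + dot(j, k ^ l)) % 2 == (x[i] + x[j]) % 2
                     for k, l in outs))
```

Exactly half of the $n^2$ pairs $(k,\ell)$ have nonzero amplitude, and each of them satisfies the required relation.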

5 Quantum Fingerprinting and the Simultaneous Message Passing Model



We now describe a model, called the simultaneous message passing (SMP) model, that is neither
a non-locality test nor the full-fledged communication complexity scenario, yet that is relevant to
both. The basic structure is illustrated in Fig. 5. Alice and Bob each receive an $n$-bit input ($x$ and
$y$, respectively). In this scenario, they do not have any shared resources like shared randomness or
an entangled state, but they do have local randomness. They each are required to send a single
message to a third party, called the Referee. The Referee, upon receiving message $m_A$ from Alice
and $m_B$ from Bob, should output the value of some (Boolean) function $f(x,y)$. The goal is to
compute $f(x,y)$ with a minimum amount of communication from Alice and Bob to the Referee.
This scenario was introduced by Yao [136] for the setting where $m_A$ and $m_B$ are classical messages
consisting of bits. We compare this classical model to the corresponding quantum version, where
$m_A$ and $m_B$ consist of qubits. We will see that for the very natural problem of equality, where
$f(x,y) = 1$ if and only if $x = y$, there is an exponential saving in communication when qubits
are used instead of classical bits. Classically, the problem of the bounded-error communication
complexity of equality in the SMP model was open for almost twenty years, until Newman and
Szegedy [102] exhibited a lower bound of about $\sqrt{n}$ bits. This is tight, since Ambainis [4] constructed
a bounded-error protocol for this problem where the messages are $O(\sqrt{n})$ bits long (we describe
a slightly less efficient classical protocol in Section 5.2). In contrast, Buhrman, Cleve, Watrous,
and de Wolf [32] showed that in the quantum setting this problem can be solved with very little
communication: only $O(\log n)$ qubits suffice.

[Figure 5 diagram: inputs $x \in \{0,1\}^n$ (Alice) and $y \in \{0,1\}^n$ (Bob); each sends one message to the
Referee, who outputs $f(x,y)$.]

Figure 5: The simultaneous message passing variant of the communication complexity scenario:
Alice and Bob receive $n$-bit strings, $x$ and $y$ respectively, as input and their communication is
restricted to each sending one message to a third party, called the Referee. From these messages,
the Referee computes some function $f(x,y)$ as the output of the protocol. There are tasks of
this form where communication in terms of quantum messages is exponentially more efficient than
communication in terms of classical messages.

5.1 Quantum fingerprints



In order to construct the efficient quantum SMP protocol for equality, we need to borrow ideas from
the efficient classical randomized communication complexity protocol for equality from Section 3.1.
Recall that in that protocol, Alice interprets her input $x$ as a polynomial $p_x(t) = \sum_{i=1}^{n} x_i t^{i-1}$ over
some finite field $\mathbb{F}$ of size $m$ (about $3n$), and then she picks a random point $a \in \mathbb{F}$ and sends $a$ and
$p_x(a)$ to Bob. The pair $a, p_x(a)$ is called a "fingerprint" of $x$, since it describes characteristics of $x$
that can aid in identifying it. Carrying out this fingerprinting procedure in superposition results
in a quantum fingerprint of $x$:

$$|F_x\rangle = \frac{1}{\sqrt{m}} \sum_{a \in \mathbb{F}} |a\rangle|p_x(a)\rangle.$$

Note that $|F_x\rangle$ consists of only $2\log m = 2\log n + O(1)$ qubits.



5.2 Classical protocol for equality 

A nearly optimal⁸ $O(\sqrt{n}\log n)$ classical protocol for equality in the SMP model goes as follows.
Alice produces a list of $k = O(\sqrt{n})$ random points $a_1, \ldots, a_k$ in $\mathbb{F}$ and sends the list $\{(a_i, p_x(a_i))\}_{i=1}^{k}$
to the Referee. Bob does the same with respect to $y$, sending $\{(b_i, p_y(b_i))\}_{i=1}^{k}$ to the Referee. By
the birthday paradox (see the footnote in Section 3.7), with constant probability there exist $i$ and
$j$ such that both $a_i$ and $b_j$ equal the same field element $d$. In this case the Referee can compare
$p_x(d)$ with $p_y(d)$. If $x = y$ then $p_x = p_y$, and hence $p_x(d) = p_y(d)$. On the other hand, if $x \neq y$,
then since $p_x$ and $p_y$ are different polynomials of degree at most $n - 1$, with probability $> 2/3$,
we have $p_x(d) \neq p_y(d)$. The protocol for the Referee is now clear: if the lists of Alice and Bob
have a point $d$ in common, then the Referee outputs 1 if and only if $p_x(d) = p_y(d)$. If there is no
point in common (which happens only with small probability) or if $p_x(d) \neq p_y(d)$, then the Referee
outputs 0.
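
A compact way to see this protocol in action is the following sketch. It makes several assumptions that are not in the text: the field is $\mathbb{F}_p$ for the smallest prime $p \geq 3n$, each party sends about $c\sqrt{p}$ points for an illustrative constant $c$, the Referee uses the first common point it finds, and all function names are hypothetical.

```python
import math
import random

def poly_eval(bits, t, p):
    """Evaluate p_x(t) = sum_i x_i t^(i-1) over F_p by Horner's rule."""
    acc = 0
    for coeff in reversed(bits):
        acc = (acc * t + coeff) % p
    return acc

def next_prime(m):
    """Smallest prime >= m, by trial division (fine for small m)."""
    def is_prime(q):
        return q >= 2 and all(q % d for d in range(2, math.isqrt(q) + 1))
    while not is_prime(m):
        m += 1
    return m

def fingerprint_list(bits, p, k, rng):
    """k random points of F_p together with the polynomial's values there."""
    points = [rng.randrange(p) for _ in range(k)]
    return [(a, poly_eval(bits, a, p)) for a in points]

def referee(list_a, list_b):
    """Output 1 iff a common point exists and the evaluations agree there."""
    evals_b = dict(list_b)
    for a, value in list_a:
        if a in evals_b:
            return 1 if value == evals_b[a] else 0
    return 0

def smp_equality(x, y, c=3, seed=None):
    # One generator is used for both parties purely for convenience; the
    # draws are independent, so this still models local randomness only.
    rng = random.Random(seed)
    p = next_prime(3 * len(x))
    k = int(c * math.sqrt(p)) + 1
    return referee(fingerprint_list(x, p, k, rng), fingerprint_list(y, p, k, rng))

x = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
print(smp_equality(x, list(x)), smp_equality(x, [1 - x[0]] + x[1:]))
```

Each message consists of $O(\sqrt{n})$ pairs of $O(\log n)$ bits, which is where the $O(\sqrt{n}\log n)$ cost comes from.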



5.3 Quantum protocol for equality 

We now have everything in place to describe the quantum protocol for equality. Alice sends state
$|F_x\rangle$ to the Referee and Bob sends $|F_y\rangle$. Note that if the Referee now measures $|F_x\rangle$ in the
computational basis, then he will find a random point $a$ and the value $p_x(a)$, just like the classical
protocol described above. The Referee thus needs to do something smarter. The key observation
is the following about the inner products between fingerprints:

$$\langle F_x | F_y \rangle \begin{cases} = 1 & \text{if } x = y \\ \leq \frac{1}{3} & \text{if } x \neq y \end{cases} \qquad (17)$$

If $x = y$ then clearly $\langle F_x | F_y \rangle = 1$. If $x \neq y$ then

$$\langle F_x | F_y \rangle = \frac{1}{m} \sum_{i,j \in \mathbb{F}} \langle i | j \rangle \langle p_x(i) | p_y(j) \rangle = \frac{1}{m} \sum_{i \in \mathbb{F}} \langle p_x(i) | p_y(i) \rangle.$$

Since $p_x$ and $p_y$ are different polynomials of degree at most $n - 1$, they have the same value
$p_x(i) = p_y(i)$ for at most $n - 1$ values of $i$. Hence the inner product is at most $\frac{n-1}{m} < \frac{1}{3}$.

8 Ambainis's protocol from [4] gets rid of the $\log n$ factor.

When Alice and Bob send their quantum fingerprints to the Referee, he has to determine the
inner product between the two states he receives. The following test (Figure 6), sometimes called
the SWAP-test, accomplishes this task with a small error probability.

[Figure 6 circuit: an ancilla qubit $|0\rangle$ passes through a Hadamard gate, controls a SWAP of the two
fingerprint registers, passes through a second Hadamard gate, and is measured.]

Figure 6: Quantum circuit to test if $|\phi\rangle = |\psi\rangle$ or $|\langle \phi | \psi \rangle|$ is small.

This circuit first applies a Hadamard transform to a qubit that is initially $|0\rangle$, then SWAPs
the other two registers conditioned on the value of the first qubit being $|1\rangle$, then applies another
Hadamard transform to the first qubit and measures it. Here SWAP is the operation that swaps
the states $|\phi\rangle$ and $|\psi\rangle$: $|\phi\rangle|\psi\rangle \mapsto |\psi\rangle|\phi\rangle$. The Referee receives $|\phi\rangle$ from Alice and $|\psi\rangle$ from Bob
and applies the test to these two states. An easy calculation reveals that the outcome of the
measurement is 1 with probability $(1 - |\langle \phi | \psi \rangle|^2)/2$. Hence if $|\phi\rangle = |\psi\rangle$ then we observe a 1 with
probability 0, but if $|\langle \phi | \psi \rangle| \leq \frac{1}{3}$ then this probability is $\geq \frac{4}{9}$. Repeating this procedure with several
individual fingerprints can make the error probability arbitrarily close to 0.
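
The numbers in this argument can be reproduced for a small instance. The sketch below computes the fingerprint overlap exactly as the fraction of field points on which the two polynomials agree, then evaluates the SWAP-test formula. It assumes the field $\mathbb{F}_{29}$ for $n = 8$ (a prime of size at least $3n$), and the repetition count is derived from the one-sided bound $(5/9)^r$, which is an elementary consequence of the $4/9$ figure rather than a statement from the text. All function names are hypothetical.

```python
import math

def poly_eval(bits, t, p):
    """Evaluate p_x(t) = sum_i x_i t^(i-1) over F_p by Horner's rule."""
    acc = 0
    for coeff in reversed(bits):
        acc = (acc * t + coeff) % p
    return acc

def fingerprint_overlap(x, y, p):
    """<F_x|F_y> = (number of a in F_p with p_x(a) == p_y(a)) / p."""
    agree = sum(1 for a in range(p) if poly_eval(x, a, p) == poly_eval(y, a, p))
    return agree / p

def swap_test_prob_one(overlap):
    """Probability that the SWAP-test measurement yields 1."""
    return (1 - overlap ** 2) / 2

def repetitions_needed(delta):
    """Smallest r with (5/9)^r <= delta: the test never outputs 1 on equal
    fingerprints, and misses unequal ones with probability at most 5/9 per run."""
    return math.ceil(math.log(delta) / math.log(5 / 9))

p = 29
x = [1, 0, 1, 1, 0, 1, 0, 0]
y = [1, 1, 1, 0, 0, 1, 0, 1]
print(fingerprint_overlap(x, x, p), swap_test_prob_one(fingerprint_overlap(x, x, p)))
print(fingerprint_overlap(x, y, p), swap_test_prob_one(fingerprint_overlap(x, y, p)))
print(repetitions_needed(0.01))
```

For equal inputs the test never reports a difference, while for unequal inputs each run reports one with probability at least $4/9$.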



5.4 Subsequent work in the SMP model 

After the quantum fingerprinting scheme showed the power of quantum communication in the SMP
model, a number of further results appeared. Yao [138] exhibited an efficient protocol for testing
if the inputs $x$ and $y$ are at some constant Hamming distance $d$, while Gavinsky et al. [69] related
quantum fingerprinting to a technique from machine learning which brings out its weaknesses. One
can also study the variant of the SMP model where Alice and Bob start with a shared entangled
state, but can only send classical messages to the Referee. Gavinsky et al. [68] exhibited a problem
based on the Hidden Matching problem and a quantum protocol that solves it with $O(\log n)$ ebits
and $O(\log n)$ classical bits of communication, while any quantum SMP protocol without prior
entanglement needs to send at least about $(n/\log n)^{1/3}$ qubits. This shows that entanglement
can reduce communication (even quantum communication!) exponentially, at least for relational
problems in the SMP model.⁹ Finally, Gavinsky, Regev, and de Wolf [70] showed that if Alice's
message to the Referee is allowed to be quantum, while Bob's message can only be classical, then the
quantum advantages over purely classical protocols mostly disappear. In particular, the equality
problem requires communication at least $\sqrt{n/\log n}$ in this hybrid case.

9 Recently, Gavinsky [66] extended this to a similar separation in the more standard two-way model. 

6 Other Aspects of Quantum Non-Locality



6.1 Non-local boxes 

In previous sections we studied a hierarchy of resources. In particular, we discussed and compared 
the correlations P(a,b\x,y) that can be obtained using only shared randomness, by local measure- 
ments on entangled states, and finally those that can be obtained if communication between the 
parties is allowed. In this section we discuss an interesting set of correlations that lie between the 
last two classes. 

To understand these new correlations, let us note that any correlations P(a,b\x,y) obtained 
in a local hidden variable model or by local measurements on an entangled state must obey the 
following properties: 

Positivity: $P(a,b|x,y) \geq 0$; (18)

Normalization: $\sum_{a,b} P(a,b|x,y) = 1$; (19)

No Signalling: $\sum_{b} P(a,b|x,y) = P(a|x)$ is independent of $y$,
$\sum_{a} P(a,b|x,y) = P(b|y)$ is independent of $x$. (20)

The last condition expresses the fact that Bob cannot transmit any information about his input 
y to Alice, and similarly Alice cannot communicate to Bob any information about her input x. 
We are interested here in correlations that obey the above three conditions, but that cannot be 
obtained from local measurements on entangled states. 

To illustrate this idea, suppose that Alice and Bob each have some kind of device (introduced
independently in [79] and in [108]) such that Alice can provide an input $x \in \{0,1\}$ to her device
and obtain an output $a \in \{0,1\}$; and Bob can provide an input $y \in \{0,1\}$ to his device and obtain
an output $b \in \{0,1\}$, and such that the probabilities of the outputs given the inputs obey

$$P(a,b|x,y) = \begin{cases} \frac{1}{2} & \text{if } a \oplus b = x \wedge y \\ 0 & \text{otherwise.} \end{cases} \qquad (21)$$

Note that, much like the correlations that can be established by use of quantum entanglement,
this device is atemporal: Alice gets her output as soon as she feeds in her input, regardless of if
and when Bob feeds in his input, and vice versa. Also inspired by entanglement, this is a one-shot
device: the correlation appears only as a result of the first pair of inputs fed in by Alice and Bob.
This device obeys the conditions (18) to (20) above, so it cannot be used to signal. We call it a non-local
(NL) box (other terminology in use is Popescu-Rohrlich (PR) box, in reference to [108]).

With this device Alice and Bob always obtain a © b = x A y, whereas we know that for local 
measurements on entangled quantum states this relation can only be satisfied with probability at 
most $\cos^2(\pi/8)$ under the uniform distribution on the inputs $x$ and $y$ (see Section 2.3 for a proof).
Thus this is an "imaginary" device in the sense that it cannot be realized physically without Alice 
and Bob's devices being connected by some kind of communication channel. It is, however, an 
interesting resource to consider, since it is "stronger" than correlations that can be obtained from 
local measurements on entangled states, but "weaker" than actual communication. 

A systematic study of the properties of correlations obeying the above three conditions was
initiated in [12], and it was shown that they obey properties that one thinks of as genuinely
quantum, such as monogamy and no-cloning [86]. They also allow for secure key distribution [11].

Because of the apparent "reasonableness" of the non-local box, Popescu and Rohrlich raised the
question (in [108], and in fact well before this) why such correlations cannot be realized in nature
without communication between the parties. The most straightforward answer is the technical
proof in Section 2.3; however, one might seek a more intuitive or philosophical explanation. One
possible approach is provided by communication complexity. It was shown by van Dam [50, 51],
and also noted by one of the authors of the present review (Cleve), that if Alice and Bob have an
unlimited amount of non-local boxes then all communication complexity problems become trivial:

Suppose Alice and Bob have an unlimited supply of non-local boxes, as described in
Eq. (21). Suppose Alice receives input $x \in \{0,1\}^n$ and Bob receives input $y \in \{0,1\}^n$.
Then communication complexity becomes trivial, in the sense that the value of any
Boolean function $f(x,y) \in \{0,1\}$ can be computed with certainty with a single bit of
communication from Alice to Bob.

To prove this, consider an arbitrary function $f : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$. It can be expressed
as a Boolean circuit consisting of NOT and $\wedge$ (AND) gates, with inputs $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$.
The idea is to represent the value of each gate of this circuit in terms of two shares, one possessed
by Alice and the other by Bob. For a bit $a$, its representation as shares is any $(a', a'')$ where
$a = a' \oplus a''$. Until the end of the protocol, Alice's information about each gate will be just the first
bit of its share and Bob's information will be the second bit. They start by constructing shares of
the input bits: $(x_i, 0)$ for each of Alice's input bits $x_i$ (Bob does not need to know $x_i$ to construct
his share 0); and similarly $(0, y_i)$ for each of Bob's input bits $y_i$. For each gate in the circuit, if
Alice and Bob collectively know the input bits as shares then they can produce the shares for the
output bit without any communication. For each NOT gate, Alice merely negates her share (and
Bob does nothing to his share). For each $\wedge$ gate, assume that the shares of inputs are $(a', a'')$ and
$(b', b'')$. The shares of the output should be $(c', c'')$ such that

$$c' \oplus c'' = (a' \oplus a'') \wedge (b' \oplus b'') = (a' \wedge b') \oplus (a' \wedge b'') \oplus (a'' \wedge b') \oplus (a'' \wedge b''). \qquad (22)$$

Consider the four terms arising above. Since Alice possesses $a'$ and $b'$, she can easily compute
$a' \wedge b'$, and similarly Bob can compute $a'' \wedge b''$. The difficult terms are $a' \wedge b''$ and $a'' \wedge b'$ because
they contain bits that are spread between Alice and Bob, and this is where the non-local boxes
are used. Alice and Bob use one non-local box to obtain bits $d'$ and $d''$ so that $d' \oplus d'' = a' \wedge b''$.
They use a second non-local box to obtain $e'$ and $e''$ so that $e' \oplus e'' = a'' \wedge b'$. Then Alice sets her
share to $c' = (a' \wedge b') \oplus d' \oplus e'$ and Bob sets his share to $c'' = (a'' \wedge b'') \oplus d'' \oplus e''$. Clearly,

$$c' \oplus c'' = (a' \wedge b') \oplus (d' \oplus d'') \oplus (e' \oplus e'') \oplus (a'' \wedge b'') = (a' \wedge b') \oplus (a' \wedge b'') \oplus (a'' \wedge b') \oplus (a'' \wedge b''), \qquad (23)$$

as required. At the end, Alice and Bob possess shares for the value of $f$, and Alice sends her one-bit
share to Bob, enabling him to compute the value of $f$.
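
The share-based gate evaluation translates almost line by line into code. The following sketch simulates an ideal non-local box with a random number generator and uses it to evaluate the Intersection function as a circuit of AND and NOT gates. The choice of function, the circuit layout, and all names are illustrative assumptions. A share is a pair (Alice's bit, Bob's bit).

```python
import random

def nl_box(x, y, rng):
    """One use of an ideal non-local box: uniformly random a, and b with a XOR b = x AND y."""
    a = rng.randrange(2)
    return a, a ^ (x & y)

def not_gate(share):
    """Alice negates her share; Bob does nothing."""
    a1, a2 = share
    return (a1 ^ 1, a2)

def and_gate(sa, sb, rng):
    """Shares of (a1 XOR a2) AND (b1 XOR b2), using two non-local boxes."""
    a1, a2 = sa
    b1, b2 = sb
    d1, d2 = nl_box(a1, b2, rng)      # d1 XOR d2 = a1 AND b2
    e1, e2 = nl_box(b1, a2, rng)      # e1 XOR e2 = b1 AND a2
    return ((a1 & b1) ^ d1 ^ e1, (a2 & b2) ^ d2 ^ e2)

def intersection_via_nl_boxes(x, y, rng=None):
    """Compute OR_i (x_i AND y_i) with a single bit sent from Alice to Bob."""
    rng = rng or random.Random()
    none_so_far = (1, 0)              # shares of the constant 1
    for xi, yi in zip(x, y):
        both = and_gate((xi, 0), (0, yi), rng)
        none_so_far = and_gate(none_so_far, not_gate(both), rng)
    c1, c2 = not_gate(none_so_far)
    return c1 ^ c2                    # Alice sends c1; Bob XORs it with his share c2

print(intersection_via_nl_boxes([1, 0, 1], [0, 0, 1]))
print(intersection_via_nl_boxes([1, 0, 1], [0, 1, 0]))
```

Each individual share is a uniformly random bit, so neither party learns anything until Alice's final one-bit message, yet the XOR of the two final shares always equals the function value.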

Is this result specific to the non-local boxes of the form Eq. (21) (in which case it could be
viewed as some kind of anomaly in the space of all possible no-signalling correlations), or does it 
hold for other no-signalling correlations? In particular, does it hold for noisy correlations? It was 
shown in [22] that the latter is the case, if one slightly adapts the definition of what it means for 
communication complexity to be trivial: 
becomes trivial, in the sense that there exists $q > 1/2$ (possibly depending on $p$, but on
no other parameter) such that, for any $n > 0$, if Alice receives input $x \in \{0,1\}^n$ and
Bob receives input $y \in \{0,1\}^n$, then they can find with probability at least $q$ the value
of any Boolean function $f(x,y) \in \{0,1\}$ with a single bit of communication from Alice
to Bob.

Note that this result does not hold if Alice and Bob share entangled states instead of (noisy)
non-local boxes. Indeed this follows from the result of [36], discussed in Section 3.8, that computing
the inner product of two $n$-bit strings with success probability $q > 1/2$ requires $\Omega(n)$ bits of
communication, even if Alice and Bob have an unlimited supply of entangled particles.

Thus the fact that communication complexity is not trivial (i.e., that some communication 
complexity problems are hard whereas others are easy) can be viewed as a partial characterization 
of the non-local correlations that can be obtained by local measurements on entangled particles. 
Is this a complete characterization? In particular, what is the exact noise threshold $p$ where
non-local boxes with noise $p$ render communication complexity trivial? The current bounds on $p$ are
$85.4\% \approx \frac{2+\sqrt{2}}{4} \le p \le \frac{3+\sqrt{6}}{6} \approx 90.8\%$. If the lower bound is the correct one, we would have
an interesting answer to the question raised by Popescu and Rohrlich. We leave this as an open
problem.

Another related open question, arising by analogy with the process of entanglement purification
[18], is whether it is possible to "purify" non-local boxes. That is, given a supply of non-local
boxes that work correctly with probability $p$, is it possible to produce, using only local operations,
a non-local box with a success probability greater than $p$? For a first step in this direction, see [61].

6.2 Bell inequalities and Tsirelson bounds 

As discussed in the previous section, there are correlations, such as the non-local box, that cannot be
reproduced by local measurements on entangled particles, but that nevertheless obey the conditions
of positivity, normalisation and no-signalling, Eqs. (18), (19), (20). More generally, we would like to
understand within the space of all possible correlations $P = \{P(a,b|x,y)\}$ which ones can be
obtained by using only shared randomness (i.e., by local hidden variable models), which ones can
be realised by carrying out local measurements on entangled particles, and what are the ultimate
limits set by Eqs. (18), (19), (20).

Answering this question would address the question raised by Popescu and Rohrlich mentioned 
above, and would give us basic insights into communication complexity. Indeed it would allow us 
to understand quantitatively the differences between shared randomness, shared entanglement, and 
non-local correlations, each of which can be viewed as a different resource for communication com- 
plexity. For instance answering this question can have immediate implications for communication 
complexity in the entanglement model, at least in the case where Alice and Bob use only one round 
of communication. 

Before addressing this question it is useful to understand better the geometry of non-local
correlations. To this end we introduce Bell expressions, that is, linear combinations of the correlations

$$C(P) = \sum_{abxy} c_{abxy} P(a,b|x,y), \tag{24}$$

where $c_{abxy}$ are real numbers. It is easy to show that the space of correlations that can be reproduced
using local hidden variables (i.e., using only shared randomness) is a polytope. That is, it can be
characterised by a finite number of inequalities, called Bell inequalities, of the form

$$C(P) \le C_{\mathrm{LHV}}. \tag{25}$$

To compute the maximum value allowed by local hidden variable (LHV) models, we can restrict
ourselves to deterministic models, where $a = a(x)$ is a function of input $x$, and $b = b(y)$ is a function
of $y$. We then have

$$C_{\mathrm{LHV}} = \max_{a(\cdot),\, b(\cdot)} \sum_{xy} c_{a(x)b(y)xy}.$$
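
Because the maximum runs over finitely many deterministic strategies, it can be found by exhaustive search when the numbers of inputs and outputs are small. The sketch below is illustrative only: the function name `lhv_max` and the nested-list layout `c[a][b][x][y]` are our own choices, and the example coefficients are those of the CHSH expression, $c_{abxy} = (-1)^{a \oplus b \oplus (x \wedge y)}$.

```python
from itertools import product

def lhv_max(c, n_a, n_b, n_x, n_y):
    """Maximum of sum_{x,y} c[a(x)][b(y)][x][y] over all deterministic
    strategies a(x) in range(n_a), b(y) in range(n_b)."""
    best = float("-inf")
    for a_of in product(range(n_a), repeat=n_x):      # a_of[x] is Alice's output on input x
        for b_of in product(range(n_b), repeat=n_y):  # b_of[y] is Bob's output on input y
            value = sum(c[a_of[x]][b_of[y]][x][y]
                        for x in range(n_x) for y in range(n_y))
            best = max(best, value)
    return best

# CHSH coefficients: +1 when a XOR b equals x AND y, and -1 otherwise.
chsh = [[[[(-1) ** (a ^ b ^ (x & y)) for y in range(2)]
          for x in range(2)]
         for b in range(2)]
        for a in range(2)]

if __name__ == "__main__":
    print(lhv_max(chsh, 2, 2, 2, 2))  # prints 2
```

The search costs $n_a^{n_x} n_b^{n_y}$ evaluations, so it is only practical for small scenarios.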

If we consider local measurements on entangled quantum states, then we have bounds of the
form

$$C(P) \le C_{\mathrm{QM}}, \tag{26}$$

where

$$C_{\mathrm{QM}} = \max \sum_{abxy} c_{abxy} \langle\psi| \Pi_a(x) \otimes \Pi_b(y) |\psi\rangle,$$

where the maximum is taken over all states $|\psi\rangle$ and over all projective measurements $\{\Pi_a(x)\}$
(depending only on $x$) and projective measurements $\{\Pi_b(y)\}$ (depending only on $y$). (By projective
measurements, we mean a set of projectors $\Pi_a = \Pi_a^2$ that sum to the identity, $\sum_a \Pi_a = I$.) Recently
it has been shown how the quantum value $C_{\mathrm{QM}}$ could be bounded by a hierarchy of semidefinite
programs [97, 98, 57], although the issue of whether this hierarchy converges remains open [117].
If we impose only the no-signalling conditions, then we will have

$$C(P) \le C_{\text{no-signalling}}, \tag{27}$$

where the right hand side is the maximum of Eq. (24) subject to Eqs. (18), (19), (20). Note that
Eqs. (18), (19), (20) define another polytope, the no-signalling polytope, and the maximum value of
$C(P)$ will be attained at a vertex of the polytope.

Let us illustrate the above concepts by a specific kind of Bell expression, called XOR non-local
games [47]. In this particular case, the outputs $a, b \in \{0,1\}$ are bits and we wish them to come as
close as possible to satisfying a condition of the form

$$a \oplus b = f(x,y) \tag{28}$$

for all $x, y$. The most celebrated example is the CHSH case, where $x, y$ are also bits and the
condition is $a \oplus b = x \wedge y$, as introduced earlier.

In the case of XOR games, we take the constants $c_{abxy}$ in Eq. (24) to have the form

$$c_{abxy} = w_{xy}(-1)^{a \oplus b \oplus f(x,y)} = m_{xy}(-1)^{a \oplus b}, \tag{29}$$

where $w_{xy} \ge 0$ can be thought of as the weight we give to the pair of inputs $x, y$, and
$m_{xy} = w_{xy}(-1)^{f(x,y)}$. In the particular case of the CHSH expression, we take $m_{xy} = (-1)^{x \wedge y}$, resulting in
the famous CHSH inequality.

When considering LHV theories, it is convenient to define new variables $A_x = (-1)^{a(x)}$ and
$B_y = (-1)^{b(y)}$, whereupon the maximum value of the Bell expression reachable by LHV theories is

$$C_{\mathrm{LHV}} = \max_{A_x, B_y \in \{+1,-1\}} \sum_{xy} m_{xy} A_x B_y.$$

In the case of local measurements carried out on entangled quantum states, we can write

$$\sum_{a,b} P(a,b|x,y)(-1)^a(-1)^b = \langle\psi| A_x \otimes B_y |\psi\rangle,$$

where $|\psi\rangle$ is the quantum state shared by Alice and Bob, and $A_x$, $B_y$ are Hermitian operators with
eigenvalues in $\{+1,-1\}$. We now use the following result of Tsirelson [129]:

Suppose Alice and Bob measure observables $A_x$ and $B_y$, both with eigenvalues in
$\{+1,-1\}$, on a pure quantum state $|\psi\rangle \in \mathbb{C}^d \otimes \mathbb{C}^d$. Then there are real unit vectors
$\alpha(x), \beta(y) \in \mathbb{R}^{2d^2}$ such that for all $x$ and $y$, $\langle\psi| A_x \otimes B_y |\psi\rangle = \alpha(x) \cdot \beta(y)$.

Thus we can re-express the maximal value of $C$ attainable by quantum mechanics as

$$C_{\mathrm{QM}} = \max_{\alpha(x),\, \beta(y)} \sum_{xy} m_{xy}\, \alpha(x) \cdot \beta(y).$$

If we impose only the no-signalling conditions, then it is possible to satisfy Eq. (28) for all $x, y$
by choosing $P(ab|xy) = 1/2$ if $a \oplus b = f(x,y)$, and $P(ab|xy) = 0$ if $a \oplus b \ne f(x,y)$. Hence the maximum
value of the game is

$$C_{\text{no-signalling}} = \sum_{xy} |m_{xy}|.$$

As an illustration, in the case of the CHSH inequality, the results of Section 2.2 can be re-expressed
as stating that $C_{\mathrm{LHV}} = 2$, $C_{\mathrm{QM}} = 2\sqrt{2}$ and $C_{\text{no-signalling}} = 4$.
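
These three CHSH values can be checked numerically from the XOR-game formulas. The sketch below is illustrative and rests on our own choices: the function names are hypothetical, and the unit vectors `alpha`, `beta` in $\mathbb{R}^2$ are one standard choice that attains $2\sqrt{2}$. The code only evaluates the expression at these vectors; that no other choice does better is Tsirelson's bound, which is not verified here.

```python
import math
from itertools import product

# CHSH as an XOR game: m_xy = (-1)^(x AND y), with unit weights.
m = {(x, y): (-1) ** (x & y) for x in (0, 1) for y in (0, 1)}

def lhv_value(m):
    """Maximise sum_xy m_xy A_x B_y over all sign assignments A_x, B_y in {+1, -1}."""
    return max(sum(m[x, y] * A[x] * B[y] for (x, y) in m)
               for A in product((1, -1), repeat=2)
               for B in product((1, -1), repeat=2))

def vector_value(m, alpha, beta):
    """Evaluate sum_xy m_xy alpha(x) . beta(y) for given real unit vectors."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))
    return sum(m[x, y] * dot(alpha[x], beta[y]) for (x, y) in m)

def no_signalling_value(m):
    """Sum of |m_xy|, the value reachable with no-signalling correlations."""
    return sum(abs(v) for v in m.values())

s = 1 / math.sqrt(2)
alpha = {0: (1.0, 0.0), 1: (0.0, 1.0)}   # Alice's unit vectors
beta = {0: (s, s), 1: (s, -s)}           # Bob's unit vectors, rotated by 45 degrees

if __name__ == "__main__":
    print(lhv_value(m))                  # 2
    print(vector_value(m, alpha, beta))  # approximately 2.828
    print(no_signalling_value(m))        # 4
```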

Interestingly, the ratio between the LHV value and the quantum value can be bounded
independently of the number of inputs $x, y$ and the choice of matrix $m_{xy}$ by Grothendieck's constant
$K_G$, as first noted by Tsirelson [129]:

$$C_{\mathrm{QM}} \le K_G\, C_{\mathrm{LHV}}.$$

A recent development of this line of work is the realisation that for certain Bell inequalities, a
violation larger than a critical value, $C(P) > C_d$, guarantees that if the correlations are obtained
by local measurements on an entangled quantum state, then the state belongs to a Hilbert space
of dimension at least $d^2$ (i.e., Alice's and Bob's spaces each have dimension at least $d$) [30, 131, 25].
These Bell inequalities can thus be thought of as "dimension witnesses".

6.3 Classical simulation of quantum correlations and quantum communication 

Consider a non-locality experiment in which Alice and Bob share an entangled quantum state 
and carry out local measurements on this state; or consider a quantum communication protocol 
in which Alice and Bob carry out several rounds of quantum communication and then carry out 
measurements on the quantum states. How much classical resources are required to reproduce 
these quantum experiments? The results from Sections [3] and 0] show that the classical resources 
must sometimes be larger, even exponentially larger, than the quantum resources. Is this the worst 
one can expect? What are good protocols to simulate the quantum experiments with classical 
resources? In this section we review progress on these questions. Note that we are of course not 
claiming that Nature works as in these simulations, but rather we are studying how one could 
mimic Nature with these alternative resources.

6.3.1 When no communication is needed.



When states are very noisy, it may be possible to simulate local measurements on them using only
shared randomness, even though the states are entangled. Werner's discovery of a family of states,
now known as Werner states, for which such a simulation is possible [133] is one of the early results of
quantum information. Werner's model was restricted to local projective measurements. Later
improvements include [3] and [10], where it was shown that simulations using only shared randomness
can also exist when considering the more general case of local Positive Operator Valued Measures$^{10}$
(POVMs), which are the most general kind of measurement allowed by quantum mechanics.



6.3.2 One-way quantum communication. 

Let us first consider the very simple scenario where Alice wants to communicate a single qubit to 
Bob and Bob wants to carry out a projective measurement on the qubit. We can formalise this 
simple scenario as follows: 

Simulation of one-way communication of a single qubit and subsequent projective
measurement. Alice receives as input a normalized vector $x \in \mathbb{R}^3$, with length
$\|x\| = 1$, which describes the quantum state $\rho = (I + x \cdot \sigma)/2$, where $\sigma = (X, Y, Z)$ is the
vector of non-trivial Pauli matrices from Eq. (14); Bob receives as input a normalized
vector $y \in \mathbb{R}^3$, which describes his projective measurement $y \cdot \sigma$. Bob must output a bit
$b$, with probabilities satisfying $P(b=0|x,y) - P(b=1|x,y) = \mathrm{Tr}(\rho\, y \cdot \sigma)$.

We can generalise this to the case where Alice sends n qubits to Bob, and Bob carries out a 
POVM on the n qubits: 

Simulation of one-way communication of $n$ qubits. Alice receives as input the
classical description of a quantum state $|\psi\rangle$, for instance by giving her the values of
the coefficients of the state in a standard basis, $|\psi\rangle = \sum_i c_i |i\rangle$. Bob is given the
classical description of a measurement, for instance by giving him the matrix elements
of the POVM elements $A_k$ in the standard basis. The task is for Bob to provide an
outcome $k$, such that the probability of outcome $k$ occurring is $P(k|\psi) = \langle\psi| A_k |\psi\rangle$.

These are communication complexity scenarios where Alice's and Bob's inputs are
infinite-dimensional. If one allows for slight imperfections in the simulation, then one can truncate the
description of the matrix elements of $|\psi\rangle$ and $A_k$, and make the number of input bits finite. For
instance on Alice's side, if $|\psi\rangle$ corresponds to the quantum state of $n$ qubits, then one can truncate
the number of inputs to $O(n2^n)$ bits (by describing each coefficient $c_i$ with $O(n)$ bits of precision).
If Alice then sends her truncated input to Bob, then we have, up to a small error, a classical
simulation (using $O(n2^n)$ bits) of any one-way quantum communication protocol in which $n$ qubits
are sent from Alice to Bob. One cannot hope to do much better than this, since the HM problem
of Section 3.7 exhibits an $n$ versus $2^{\Omega(\sqrt{n})}$ gap between the quantum and classical one-way
communication complexity (and this was further strengthened to two-way classical communication
complexity in [65]).

10 A Positive Operator Valued Measure (POVM) is a set $\{A_k\}$ of positive-semidefinite matrices that sum to identity:
$\sum_k A_k = I$. When applied to a quantum system in state $\rho$, the probability of obtaining measurement outcome $k$ is
$\mathrm{Tr}(A_k \rho)$.

6.3.3 Entanglement simulation

We can also consider the case where Alice and Bob want to simulate local measurements on
entangled quantum particles. The simplest non-locality scenario occurs when Alice and Bob carry out
projective measurements on a single ebit:

Simulation of projective measurements on a single ebit. Alice and Bob each
receive as input a normalized vector in $\mathbb{R}^3$, $x, y$ with $\|x\| = \|y\| = 1$, which describe
their projective measurements $x \cdot \sigma$, $y \cdot \sigma$. Alice and Bob must each output a bit ($a$, $b$,
respectively) such that the correlations obey

$$P(a = b|x, y) - P(a \ne b|x, y) = -x \cdot y = \langle\psi_-| x \cdot \sigma \otimes y \cdot \sigma |\psi_-\rangle,$$

where $|\psi_-\rangle = (|0\rangle|1\rangle - |1\rangle|0\rangle)/\sqrt{2}$, and such that the marginals, $P(a|x, y)$ and $P(b|x, y)$,
are uniform (i.e., $P(a = 0|x, y) = P(a = 1|x, y) = 1/2$, etc.).

This can be generalized to the case where Alice and Bob carry out POVMs on arbitrary
entangled states of $n$ qubits:

Simulation of entangled states of dimension $2^n$. Alice and Bob share a classical
description of a pure entangled quantum state $|\psi\rangle_{AB}$, where Alice's and Bob's systems
are each of dimension $2^n$. Alice and Bob receive as inputs $x, y$ the classical
(infinite-dimensional) descriptions of the measurements they should do (for instance the inputs
could consist of the matrix elements of the POVM elements in a standard basis). Alice
and Bob must provide outputs $a, b$ such that the joint probability $P(a, b|x, y)$ equals the
probability of getting measurement outcomes $a$ and $b$ when measurements $x$ and $y$ are
carried out on state $|\psi\rangle_{AB}$.

If we have a simulation of one-way quantum communication, then we can transform it into
a simulation of entanglement. To see this, note that one can rewrite the joint probabilities as
$P(a, b|x, y) = P(a|x)P(b|x, y, a)$. The simulation is then as follows: Alice chooses $a$ according to the
probability distribution $P(a|x)$; she then sends Bob sufficient information so that he can choose an
output $b$ distributed according to $P(b|x, y, a)$. It is easy to show that for this second task (producing
$b$ distributed according to $P(b|x, y, a)$) it suffices for Alice to send Bob the measurement outcome,
and to describe to him the state onto which his system is projected after Alice's measurement.$^{11}$
Using this correspondence, we thus have a protocol which provides, up to a small error, a classical
simulation (using $O(n2^n)$ bits of one-way communication) of any measurement on entangled states
of $n$ qubits.
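
For a single ebit in the singlet state, the factorisation $P(a, b|x, y) = P(a|x)P(b|x, y, a)$ can be written out explicitly. The sketch below is an illustration under our own conventions, not a protocol from the cited works: bit value 0 is taken to correspond to measurement eigenvalue $+1$, Bob's post-measurement state is described by its Bloch vector (which is $-x$ when Alice obtains $+1$ along $x$, and $+x$ otherwise), and the function names are hypothetical. The code computes the exact joint distribution rather than sampling from it.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def singlet_joint(x, y):
    """Joint distribution P(a, b | x, y) for projective measurements along unit
    vectors x, y on the singlet, assembled as P(a | x) * P(b | x, y, a)."""
    p = {}
    for a in (0, 1):
        p_a = 0.5                                    # Alice's marginal is uniform
        sign_a = 1 if a == 0 else -1                 # bit 0 <-> eigenvalue +1 (our convention)
        bob_bloch = tuple(-sign_a * xi for xi in x)  # Bob's state after Alice's outcome
        for b in (0, 1):
            sign_b = 1 if b == 0 else -1
            # Probability of outcome sign_b when a qubit with this Bloch vector
            # is measured along y.
            p_b = (1 + sign_b * dot(bob_bloch, y)) / 2
            p[a, b] = p_a * p_b
    return p

def correlation(p):
    """P(a = b) - P(a != b)."""
    return sum((-1) ** (a ^ b) * v for (a, b), v in p.items())

if __name__ == "__main__":
    theta = 0.7
    x = (0.0, 0.0, 1.0)
    y = (math.sin(theta), 0.0, math.cos(theta))
    p = singlet_joint(x, y)
    print(correlation(p), -dot(x, y))   # the two numbers agree
```

In a classical simulation along these lines, the information Alice must convey is her outcome together with a description of `bob_bloch`, which is why the input truncation discussed above is needed to keep the communication finite.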

6.3.4 Exact classical simulations 

Remarkably it is also possible, at least in some cases, to perfectly simulate the quantum commu- 
nication or quantum entanglement scenarios with finite classical communication. In such perfect 
simulations we do not tolerate any error. Of course such exact simulations are in principle not nec- 
essary if one wants to interpret the results of real experiments, as any real experiment will always 
have small imperfections. But these exact simulations are interesting for at least two reasons. On
the one hand they show that perfectly simulating quantum systems is not much more costly than
approximately simulating them. On the other hand, these exact simulations have quite interesting
structures. One can hope that understanding these structures will help us understand the power
(and limitations) of quantum communication.

11 We can assume without loss of generality that Alice's POVM elements all have rank 1, which implies that
conditional on the measurement outcome, Bob's state is pure.

Exact classical simulations of quantum correlations were first independently reported in [92], [23]
and [124]. Here we review briefly the subsequent works on this topic.

We first consider a weak model, where the average amount of classical communication is bounded
(but in the worst case the amount of classical communication may be infinite). This model was
first used in [92, 124] in the context of classical simulation of a single ebit. In [89] this approach was
generalized to the simulation of communicating $n$ qubits, or the simulation of POVM measurements
on $n$ ebits, using $O(n2^n)$ bits of two-way classical communication on average.

A stronger and more interesting model is when the amount of classical communication is
bounded (even in the worst case). This model was introduced in [23]. The simulations were
improved, and in [127] it was shown that the classical simulation of projective measurements on a 
single ebit could be realized with a single bit of classical communication from Alice to Bob, and the 
communication of a single qubit could be simulated with 2 bits of communication. Note that these 
simulations use an infinite amount of shared randomness, a requirement that was shown in [89] to 
be necessary when the amount of communication is bounded (in the worst case). 

An even stronger model for the simulation of entanglement is for Alice and Bob to use as
resource non-local boxes, rather than classical communication. Indeed, as discussed in Section 6.1,
one bit of classical communication can be used to realize a non-local box, but a non-local box
cannot be used to communicate. It was shown in [42] that simulating projective measurements on
a single ebit could be carried out with the use of a single non-local box. A unified approach to
protocols simulating a single ebit with one bit of communication or with one non-local box was
presented in [54].

7 Implementations 

7.1 Inefficient detectors 
7.1.1 The detection loophole 

In this section we put a constraint on the quantum model. We will suppose that any measurement
on a quantum system gives the results predicted by quantum mechanics with probability $\eta$, and
does not give any result with probability $1 - \eta$.

The motivation for considering this model is that most quantum communication experiments use 
photons. Photons are very practical because they can be quite easily produced, manipulated, trans- 
mitted over long distances, and measured. Unfortunately photons get absorbed during transmission 
(in commercial optical fibers, photons have approximately 50% probability of being absorbed af- 
ter travelling 15km), and single-photon detectors have limited efficiency: they will sometimes not 
detect a photon even though it is present. These effects can be described by the above model. 

In most experiments to date, the detector efficiency $\eta$ was so low that the correlations could
be explained by a classical model using shared randomness and no communication (a local hidden
variable model). This is called the Detection Loophole [106]. Thus for instance in the CHSH
experiment, the correlations can be explained by a local hidden variable model if $\eta \le 2/(\sqrt{2} + 1) \approx
0.8284$. A detector efficiency better than this bound has not (yet) been achieved in experiments
involving only photons.

One solution to the above problem is technological: one should use a quantum system on
which measurements can be carried out with high efficiency. In this respect atoms or ions are
particularly interesting, because measurements on these systems can be carried out with essentially
100% efficiency. Thus experiments involving two entangled ions have been carried out in which
the detection loophole was closed [115, 91]. However these experiments have not yet allowed both
the detection loophole and the locality loophole (i.e., carrying out both measurements at spatially
separated locations) to be closed simultaneously.

Instead of (or in addition to) improved technology, another solution to this problem is to develop 
new non-locality tests that demonstrate non-locality with low detector efficiency. As we shall see 
in the following, the communication problems and protocols developed in the previous sections can 
be used to build such tests. 

7.1.2 Communication complexity and the detection loophole 

Communication complexity suggests that by increasing the dimension $d$ of the entangled system
under study, one can decrease exponentially (in $d$) the required efficiency of the detectors. Indeed,
it appears that in many cases the minimum number $c$ of bits of classical communication required to
reproduce the quantum correlations is related to the minimum efficiency of the detectors required
for the correlations to be non-local by $\eta > 2^{-O(c)}$. That there should be a relation between $c$ and
$\eta$ was first noted in [71] and further studied in [87], [35], [36].

To understand this relation we will compare two classical schemes: 

• In the first scheme, which was discussed at length in Sections [2] and [U the detectors have 
100% efficiency, the parties have shared randomness and may exchange up to c bits of classical 
communication. 

• In the second scheme, the parties have shared randomness, and each party has a detector of
efficiency $\eta$. This means that each party will with probability $\eta$ give an output, and with
probability $1 - \eta$ produce no output. The detectors are assumed to be independent, so that the
probability that both detectors give an output is $\eta^2$. In the physics terminology this would be
called a local hidden variable model with detector efficiency $\eta$. (We will also consider below
the case where one of the detectors has efficiency $\eta$, and the other always gives a result, i.e.,
is 100% efficient.)

These two schemes can be related in a number of ways. The simplest relation is: 

Any classical protocol with $c$ bits of communication can be mapped into a classical
protocol with no communication but with detector efficiency $\eta^2 = 2^{-c}$.

This mapping is very simple: Alice and Bob use shared randomness $r$ which is uniformly distributed
over all possible conversations. Each party checks whether $r$ is a conversation that is consistent
with their input. If it is, then they give the corresponding output; if it is not, then they do not give
any output. The probability that both Alice and Bob give an output is at least $2^{-c}$.
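
The mapping can be made concrete on a toy protocol. Everything in the sketch below is an assumption made for illustration: the 4-bit alternating protocol (Alice sends an input bit, Bob replies with the AND of that bit and his own bit, twice), the rule that the answer is the XOR of Bob's two replies, and all function names. Instead of sampling the shared string $r$, the code enumerates all $2^c$ candidate conversations, which makes the acceptance probability exact.

```python
from itertools import product

C = 4  # number of communicated bits in the toy protocol (illustrative choice)

def alice_bit(x, history):
    # Alice speaks on even turns and sends her next input bit.
    return x[len(history) // 2]

def bob_bit(y, history):
    # Bob speaks on odd turns: his next input bit AND the bit Alice just sent.
    return y[len(history) // 2] & history[-1]

def consistent(r, next_bit, inp, first_turn):
    """True if, on every turn belonging to this party, transcript r contains
    exactly the bit the party would have sent given the earlier bits of r."""
    return all(r[i] == next_bit(inp, r[:i]) for i in range(first_turn, len(r), 2))

def output(r):
    # Both parties read the answer off the conversation: XOR of Bob's replies.
    return r[1] ^ r[3]

def zero_communication(x, y, r):
    """Each party outputs only if the shared string r is a conversation
    consistent with its own input; None models 'no detector click'."""
    a = output(r) if consistent(r, alice_bit, x, 0) else None
    b = output(r) if consistent(r, bob_bit, y, 1) else None
    return a, b

if __name__ == "__main__":
    x, y = (1, 1), (1, 0)
    accepted = [r for r in product((0, 1), repeat=C)
                if None not in zero_communication(x, y, r)]
    print(len(accepted), "of", 2 ** C, "shared strings make both parties answer")
    print("answer:", output(accepted[0]))
```

For a deterministic protocol exactly one string, the true conversation, is accepted by both parties, so both answer with probability $2^{-c}$ and the answer is then correct. Each party on its own answers more often than that, which is one way the output probabilities can differ between parties.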

This protocol is not perfect since the probability that the parties give an output may differ from
one party to the other, or from one input to the other. What is interesting is that in a number of
cases the converse holds: if the quantum correlations cannot be reproduced with fewer than $c$ bits of
communication, then they can be reproduced without communication only if the detector efficiency
$\eta$ is less than $2^{-\Omega(c)}$.

A first example where this converse occurs is when bounds on $c$ and on the minimum detection
efficiency $\eta$ can be obtained from the size of monochromatic rectangles (see Appendix B for a
brief presentation of this notion). This approach was implicit in [87], where it was shown that the
correlations of the distributed Deutsch-Jozsa problem could not be reproduced by a local hidden
variable model if $\eta > O(n^{3/4})\,2^{-0.0035n}$ when the inputs consist of $n$-bit strings, and hence the parties
use a maximally entangled system of dimension $n$. Using the size of monochromatic rectangles was
exploited more fully in [35] in the context of a multipartite communication complexity problem, and
then extended in [36] to take into account the possibility of errors. In particular, in [36] it was shown
how one could obtain a lower bound $c \ge B_R$ on the minimum amount of communication required to
reproduce the correlations, where $B_R$ is a function of the size and discrepancy of rectangles. It then
followed that the correlations could be obtained by a local hidden variable model with detectors of
efficiency $\eta$ only if $\eta \le 2^{-B_R/n}$ (where $n$ is the number of parties). If the rectangle lower bound on
$c$ is close to tight, then this implies the relation we mentioned above between $c$ and $\eta$.

7.1.3 Asymmetric detection loophole 

Another interesting example arises if we suppose that Alice's detector is inefficient, but that Bob's 
detector is perfect. This situation is motivated by the experimental situation reported in [96], where 
an ion is entangled with a photon. As discussed above, the measurements on the ion can be done 
with 100% efficiency, whereas those on the photon will be inefficient. The problem in which Alice's 
detector is inefficient and Bob's detector is perfect was previously investigated from the point of
view of the detection loophole in [40, 29] for entangled systems of dimension 2.

We prove in Appendix D that the Hidden Matching problem is particularly well adapted to this
asymmetric scenario. Namely, we show the following:

Suppose Alice and Bob try to implement the Hidden Matching problem using $\log n$
ebits, as discussed in Section 3.7. Suppose that Alice's detector has efficiency $\eta$ whereas
Bob's detector has 100% efficiency. Then the correlations obtained by measuring the
ebits cannot be reproduced by a classical model without communication if rj > 2~^^™\ 
even allowing for a small error probability. 

To our knowledge, this is the first time it is shown that an exponentially small detection efficiency 
can be tolerated when allowing for a small error probability. 

7.2 Present and future experiments 
7.2.1 Experimental quantum non-locality

During the past decades there have been many experiments that studied the correlations exhibited 
by measurements on entangled quantum particles. Their main aim was to test quantum mechanics 
by comparing its predictions with those of hidden variable models. The short result is that the 
predictions of quantum mechanics have always been verified to very high precision. However, up 
to now some "loopholes" have always been left open, which allow the possibility of explaining the 
data with — admittedly contrived — local hidden variable models. 

We very briefly review how experiments on quantum non-locality have been improved during
the past decades. We then discuss how the insights from communication complexity suggest new
experimental challenges. We also discuss experimental realizations of quantum communication
complexity.

After the initial experiment by Freedman and Clauser [63] on the correlations exhibited by
entangled photons, the first qualitative advance was the experiments of Aspect that used time-varying
analyzers in order to close the locality loophole. Indeed in previous experiments the measurements
were kept fixed for long periods of time while experimental results were accumulated, then the
measurements were changed and a new set of data was acquired for the new measurement setting.
In Aspect's experiment [6] the measurement settings changed periodically in time. In the later
experiment of Weihs et al. [132], the measurement settings were chosen at random using a quantum
random number generator.

Another important advance of the experiments of Aspect et al. [7, 8] was a very precise check
that the measured correlations coincide with the correlations $P_{\mathrm{QM}}(ab|xy)$ predicted by quantum
mechanics for local measurements on a maximally entangled state of two particles (earlier
experiments were much more imprecise).

Some other noteworthy advances: 

• Non-locality experiments in which the two particles were separated by a large distance of 10 
km [H6] and 50 km [85]; 

• Non-locality experiments on bipartite entangled systems of dimension 3 [130, 125];

• Non-locality experiments on entangled states of three [104, 109] and four particles [116, 140].

In all the above experiments the detection loophole was not closed. This means that the raw 
data acquired during the experiment could be explained by a local hidden variable model. It was 
only by making the (physically very reasonable) assumption that the events in which the detector 
gives a click are independent of the measurement settings and measurement results (known in the 
physics literature as the "fair sampling assumption") that these experiments could be assumed to 
be in contradiction with local hidden variable models. 

As mentioned above, there have now been two experiments involving ions in which the
detection loophole has been closed. In the first, the two entangled ions were separated by about 3
μm [115]; in the second, presented in more detail in Fig. 7, the two entangled ions were separated
by about a meter [91]. In view of these advances, closing both the locality and detection loopholes
simultaneously does not seem out of reach.

From the point of view of communication complexity, closing the detection loophole is more 
important than closing the locality loophole. Indeed, if the detection loophole is not closed, it 
means that the raw data can be explained by a model without communication. On the other hand, 
if the detection loophole is closed, then, by sharing the entanglement, the parties have a resource 
that could only be reproduced classically by communication between the parties. The same is true 
in other applications of quantum non-locality: closing the detection loophole (but not necessarily 
the locality loophole) allows one to increase the security of quantum key distribution [2]. 

7.3 Future non-locality experiments 

The progress in quantum communication complexity points the way towards new tests of quantum
non-locality which use not one ebit, as in the CHSH test, but many ebits. Ideas for these new tests
come from the entanglement-based Deutsch-Jozsa problem discussed in Section 4, the
entanglement-based Hidden Matching problem discussed in Section 3.7, recent work of Gavinsky [66], and also
the (non-constructive) results on three-party correlations reported in [107]. There are at least two
motivations for such experiments. First of all they could be more robust against experimental
imperfections (such as the detection loophole or errors) than non-locality tests used at present. Second
they could illustrate the efficiency of quantum mechanics over classical mechanics, as experiments
on a small number $e$ of ebits could only be reproduced classically using an exponentially large (in
$e$) amount $c$ of classical communication.

Figure 7: Bell inequality with two remote atomic qubits. The left-hand side is a schematic
description of the experiment reported in [91] in which the internal states of two ions separated by
about one meter were entangled. Measurements on the two ions then allowed the violation of the
CHSH inequality with the detection loophole closed. A series of laser pulses simultaneously excite
both Yb$^+$ ions in such a way that when they deexcite, they emit a photon whose polarization is
entangled with the ion. A lens is used to couple the photons into optical fibers. The wave plate
($\lambda/4$) is used for convenience to convert circular polarization into linear polarization. The two
photons interfere on a Beam Splitter (BS) and are detected by Photo Multiplier Tubes (PMT).
Simultaneous detection of a photon by the two PMTs signals that the photons were in a Bell state,
thereby realizing entanglement swapping: the two ions are now entangled. The internal states of
the ions are then measured, enabling a violation of the CHSH inequality. Note that there are many
inefficiencies in this experiment: only a fraction of emitted photons are coupled into the optical
fibers, and only a fraction of the photons reaching the PMTs are detected. But when two photons
are detected, one knows with certainty that the two ions are entangled. The right-hand side is a
photograph of one of the ion traps. The other trap is similar, and located about one meter away
on the same optical table. (Both figures courtesy of S. Monroe and D. Matsukevich; left-hand side
panel copyright American Physical Society).

These non-locality experiments on many ebits can be characterized by several parameters.
In particular these would include the number $e$ of ebits involved, or equivalently the dimension
$d = 2^e$ of the entangled quantum system; the minimum detector efficiency $\eta$ required for the
correlations to be non-local; the amount $\epsilon$ of errors that can be tolerated; and the amount $c$ of
classical communication that would be required to reproduce the quantum correlations. In general,
for any given non-locality test, we can expect tradeoffs between $\eta$, $\epsilon$ and $c$.

An important point to note is that the proposals inspired by communication complexity typically
are asymptotic results that deal with the limit where the number of ebits tends towards infinity:
$e \to \infty$. However, real experiments will deal with small values of $e$. For instance, if we think of
the detection loophole, one should recall that this is only a problem for experiments dealing with
entangled photons. On the other hand, the dimension of the Hilbert space of a single photon can be
larger than 2. One can thus effectively manipulate more than one qubit, while manipulating only a
single photon. This is potentially an interesting opportunity. Indeed it would be very interesting to
devise non-locality experiments that tolerate inefficient detectors (say $\eta < 10\%$) in Hilbert spaces of
moderate dimension (say $d = 10$). If one could devise such a non-locality experiment, there would be
a strong incentive to realize it experimentally. Indeed, whereas experiments involving entangled
atoms or ions may be the short-term solution for closing the detection loophole, such experiments are
much slower and much more expensive than experiments involving photons only. Numerical searches
for such a non-locality experiment have been undertaken, but unsuccessfully so far [90].

In summary, quantum communication complexity suggests the possibility of new non-locality 
experiments on a moderate number of ebits that either are very resistant to imperfections, or require 
very large amounts of classical resources to reproduce classically. Realizing such experiments will 
require further progress on the theoretical and experimental side. 

7.3.1 Experimental communication complexity 

The experimental situation concerning communication complexity proper is less advanced. Indeed,
in order to carry out any nontrivial experimental demonstration of communication complexity, one
needs to take into account the limited efficiency of detectors, which has been such a plague for
non-locality experiments. In this respect, the first convincing communication complexity experiment
to date is that reported in [128], in which 6 parties, materialized by waveplates along a beam on an
optical table, carried out the communication complexity problem proposed in [35] [Ml [31], but in the
version proposed in [64], which does not use entanglement. In this experiment the limited efficiency
of detectors was explicitly taken into account. Experiments that studied the entanglement-based
version of this problem while explicitly taking into account the limited efficiency of the detectors
have also been reported [139], based on the proposal of [41].

Another protocol that has been studied experimentally is quantum fingerprinting, which in the
SMP model performs exponentially better than classical protocols (see Section 5.3). The possibility
of realizing such an experiment at a small scale involving one or a few photons has been discussed
in [531 EH], and the experiment was later performed using photons [76] and in NMR [SB].
In the future we may expect further proof-of-principle experiments of quantum communication
complexity involving the exchange of more qubits and larger distance between the parties.
Good candidates for such experiments are Raz's communication complexity problem, the Hidden
Matching problem and its extensions, and quantum fingerprinting.

8 Conclusion 

8.1 Open questions 

Quantum communication complexity and quantum non-locality are by now mature fields. But 
many questions remain open. Here we collect a few. 

1. Additional natural problems in quantum communication complexity. Find additional
problems (if possible natural problems that could have potential applications) for
which quantum communication is much more efficient than classical communication.

2. How much entanglement is needed to get a reduction of communication: equivalence
of quantum communication and entanglement models of communication
complexity. In the entanglement model of communication complexity, the parties have an
unlimited supply of entanglement and use it to reduce the amount of classical communication.
How much entanglement is really needed? In classical communication complexity with shared
randomness, Newman's Theorem [101] states that, if we allow a small increase in the error
probability, the parties need only have $O(\log n)$ shared random bits (where $n$ is the size of the
inputs). Does something similar hold when we replace shared randomness by entanglement?
Answering this question would essentially establish whether the quantum communication and
the entanglement models of communication complexity are equivalent.

3. Are most quantum states useful for communication complexity? It was recently 
shown in [73] that most n-qubit states (with respect to the uniform measure) are not useful — 
they are typically too entangled — in the measurement-based version of quantum computation. 
Are most states useful for communication complexity? For two parties the answer is yes, as 
they can work in the Schmidt basis. But consider three parties sharing a random state 
of 3n qubits (each party having n qubits). How useful are most states for communication 
complexity (asymptotically as n tends to infinity)? 

4. Find new non-local games, qualitatively different from existing ones. In particular 
consider the following more specific subquestions: 

• For two-party XOR games, the ratio between the classical and the quantum value of
the game is bounded by a constant. However, in [107] it was shown (using a
non-constructive proof) that this is not the case for three-party games. Can one exhibit an
explicit example of this type?

• Find Bell inequalities involving rather small systems, say where the dimension of each 
party's Hilbert space is less than d = 10, which allow for very small detector efficiencies. 

5. Non-local boxes and communication complexity. As discussed in Section 6.1, non-local
boxes are an interesting resource to consider from the point of view of communication
complexity. In this regard, two interesting questions are:

• First, what is the noise threshold below which non-local boxes make communication
complexity trivial (see [22] for a formulation of this problem)? Is this threshold the
maximum value $p = (2 + \sqrt{2})/4$ attainable by local measurements on entangled quantum
systems?

• Second, is it possible to amplify non-local correlations, in the sense that given a large
number of devices that will produce correlations $P(ab|xy)$ corresponding to PR boxes
with noise $p$, is it possible to use the devices in such a way as to produce correlations
with a lower value of $p$? A first result in this direction can be found in [61].

6. Simulation of quantum correlations and quantum communication. In this context, 
some questions that come to mind are: 

• Exact simulation of more than one qubit or ebit using bounded classical communication
(in the worst case) or non-local boxes. Some preliminary results on this topic have
been obtained in the particular case where Alice and Bob carry out measurements with
binary outcomes [55] [113].

• The simulation of non-maximally entangled states using non-local boxes. This appears
to be much harder than the simulation of maximally entangled states; see [28] [27] for
some first results.

• The simulation of multipartite non-local correlations. 

8.2 What have we learnt from quantum communication complexity? 

Communication complexity is a task for which quantum information can beat classical informa- 
tion. Such tasks are rare, and finding more potential applications of quantum information is very 
important. 

Unfortunately most quantum communication complexity problems are either extremely sensitive 
to noise, highly contrived, or do not offer exponential gains over the best classical protocols (in 
which case the advantages of quantum communication will probably be more than offset by the 
lower cost and higher speed of classical communication). The most interesting proposal so far is 
maybe the SMP model without shared randomness (a somewhat contrived model) where equality (a 
very natural problem) can be solved exponentially more efficiently using quantum communication. 
Thus there is the tantalizing possibility that some time in the future, quantum communication 
complexity could be used in practical applications. 

Independently of whether quantum communication complexity ever finds some real-world
applications, the results obtained so far have important conceptual implications. First of all they
offer new insights into the power of quantum information, and in particular of quantum computing. 
Indeed the basic aim of computer science, taken in a wide sense, is to accomplish a task by using the 
minimum amount of resources. In the usual formulation, the resource that we want to minimize is 
the running time of the computer. This is the most important application of quantum computing 
as Shor's algorithm suggests that a quantum computer would allow exponential speedups. But 
in this context it is very difficult — if not impossible — to prove that quantum computers are more 
powerful than classical computers. The advantage of quantum computation can however be proven 
in simpler contexts such as the black-box model of quantum computing, where the resource that 
is quantified is the number of calls to an oracle; or communication complexity where the resource 
that is quantified is the amount of communication. The existence of these models where it can be 
rigorously shown that quantum information offers important advantages over classical information
reinforces our confidence that quantum computers are much more powerful than classical computers 
for certain tasks. 

Second, the study of quantum communication complexity has led to the proposal of new tests of 
quantum mechanics. Indeed from Bell onwards it was known that if one wants to replace quantum 
mechanics by a classical model, this classical model would have to use faster than light signalling. 
The discovery of fast quantum algorithms suggested that such a classical model would use an 
exponentially large number of resources. Quantum communication complexity has now advanced 
to the point where it may be possible to propose experiments in which one can prove that a classical 
simulation would require exponentially more resources than are used quantum mechanically. 

In summary, quantum communication complexity is now a mature field that has led to some 
fundamental insights into the nature of computation and the foundations of physics. 
A Nayak's Proof of a Consequence of Holevo's Bound

Here we prove that if we are encoding $n$ bits in $d$-dimensional quantum states, then the average
recovery probability is at most $d/2^n$. Therefore, an exact procedure requires $d \geq 2^n$, and thus at
least $n$ qubits.

Let $\rho_0, \ldots, \rho_{2^n-1}$ be the $d$-dimensional states that encode the elements of $\{0,1\}^n$ (which we
identify with $\{0, 1, \ldots, 2^n - 1\}$ in the obvious way). Let $E_0, \ldots, E_{2^n-1}$ be the measurement operators
applied for decoding (they sum to the $d$-dimensional identity). The probability of successfully
recovering $x \in \{0,1\}^n$ from its encoding is $\mathrm{Tr}(E_x \rho_x)$. Therefore, we can bound the success probability
for a uniformly random $x \in \{0,1\}^n$ by

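$$\frac{1}{2^n} \sum_{x=0}^{2^n-1} \mathrm{Tr}(E_x \rho_x) \;\leq\; \frac{1}{2^n} \sum_{x=0}^{2^n-1} \mathrm{Tr}(E_x) \;=\; \frac{1}{2^n} \mathrm{Tr}\Big(\sum_{x=0}^{2^n-1} E_x\Big) \;=\; \frac{1}{2^n} \mathrm{Tr}(I) \;=\; \frac{d}{2^n}. \qquad (30)$$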



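The first inequality follows because the density operator $\rho_x$ is positive semi-definite and has trace 1,
therefore it can be unitarily diagonalized: $U^* \rho_x U = D$, where $D$ is diagonal with diagonal entries
that are non-negative and sum to 1. Because the trace is invariant under cyclic permutations of the
matrices, we now have $\mathrm{Tr}(E_x \rho_x) = \mathrm{Tr}(U^* E_x U U^* \rho_x U) = \mathrm{Tr}(U^* E_x U D) \leq \mathrm{Tr}(U^* E_x U I) = \mathrm{Tr}(E_x)$.
The inequality holds because the diagonal entries of the positive semi-definite matrix $U^* E_x U$ are
non-negative and the diagonal entries of $D$ are at most 1.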
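
The bound $d/2^n$ can be checked on a toy example. The following is a minimal sketch, not part of the original proof: it assumes a purely illustrative encoding that maps $x$ to the basis state $|x \bmod d\rangle$ and decodes with basis projectors (the function names are hypothetical). For this choice the average recovery probability equals $d/2^n$ exactly, which suggests that the bound cannot be improved in general.

```python
from fractions import Fraction


def trace_of_product(a, b):
    """Return Tr(a b) for two square matrices given as lists of rows."""
    size = len(a)
    return sum(a[i][k] * b[k][i] for i in range(size) for k in range(size))


def basis_projector(index, dim):
    """Return the dim x dim matrix |index><index|."""
    return [[1 if (r == c == index) else 0 for c in range(dim)] for r in range(dim)]


def zero_matrix(dim):
    """Return the dim x dim all-zero matrix."""
    return [[0] * dim for _ in range(dim)]


def average_success(n_bits, dim):
    """Average of Tr(E_x rho_x) over all n-bit messages x (assumes dim <= 2**n_bits)."""
    messages = range(2 ** n_bits)
    # Illustrative encoding (an assumption): rho_x = |x mod dim><x mod dim|.
    rho = [basis_projector(x % dim, dim) for x in messages]
    # Decoding operators: E_x = |x><x| for x < dim and 0 otherwise,
    # so that the E_x sum to the dim-dimensional identity.
    povm = [basis_projector(x, dim) if x < dim else zero_matrix(dim) for x in messages]
    total = sum(trace_of_product(povm[x], rho[x]) for x in messages)
    return Fraction(total, 2 ** n_bits)
```

For instance, `average_success(3, 2)` encodes 3 bits into a single qubit and can be compared with $d/2^n = 2/8$.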



B Rectangles and the Lower Bound for Distributed Deutsch-Jozsa 

Separations between quantum and classical communication complexity always require two things: 
an efficient quantum protocol for some problem, and a lower bound on the communication of all 
classical protocols solving that same problem. In this appendix we will give some tools for lower 
bounding classical communication complexity, leading eventually to the lower bound on classical 
protocols for the Distributed Deutsch-Jozsa problem that we mentioned in Section 3.4.



B.1 Rectangles

Consider some communication complexity problem $f : X \times Y \to \{0,1\}$, where Alice starts with
an input $x \in X$ and Bob starts with an input $y \in Y$. We start by introducing the crucial
combinatorial notion for classical lower bounds. A rectangle is a set $R \subseteq X \times Y$ that is of the
form $R = A \times B$ with $A \subseteq X$ and $B \subseteq Y$. For example, if $n = 2$ and $A = \{00, 01\}$, $B = \{01, 10\}$,
then $R = A \times B = \{(00,01), (00,10), (01,01), (01,10)\}$ is a rectangle. The following result is a
fundamental property of classical deterministic protocols.

Lemma 1. If a deterministic protocol has communication $c$, then there exist $2^c$ rectangles $R_1, \ldots, R_{2^c}$
that partition $X \times Y$, such that the protocol gives the same output $a_i$ for each $(x,y) \in R_i$.

We omit the easy proof of this lemma, which is by induction on $c$. For example, suppose there
is only one $k$-bit message $m$ going from Alice to Bob and then Bob returns the 1-bit output. Then
the $2^{k+1}$ rectangles would be of the form $R_{m,a} = A_m \times Y_{m,a}$, with $m \in \{0,1\}^k$ and $a \in \{0,1\}$,
where $A_m$ is the set of $x$'s for which Alice sends $k$-bit message $m$, and $Y_{m,a}$ is the set of $y$'s for
which Bob returns output $a$ when receiving message $m$. Note that if our protocol computes $f$
correctly, then the rectangles are "monochromatic": the protocol returns the same answer $f(x,y)$
for all $(x,y) \in R_i$.
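
The one-message example above can be made concrete with a short sketch. It assumes a hypothetical protocol for equality on 2-bit strings in which Alice simply sends her whole input ($k = 2$); the helper names are illustrative and not taken from the text. The code builds the sets $A_m$ and $Y_{m,a}$ and checks that the resulting rectangles partition $X \times Y$ and are monochromatic.

```python
from itertools import product


def one_way_rectangles(inputs_x, inputs_y, message, output):
    """Return {(m, a): (A_m, Y_{m,a})} for a protocol with a single message.

    message(x) is the message Alice sends on input x;
    output(m, y) is the bit Bob returns on message m and input y.
    Empty rectangles are omitted.
    """
    rectangles = {}
    for m in sorted({message(x) for x in inputs_x}):
        a_m = frozenset(x for x in inputs_x if message(x) == m)
        for a in (0, 1):
            y_ma = frozenset(y for y in inputs_y if output(m, y) == a)
            if y_ma:
                rectangles[(m, a)] = (a_m, y_ma)
    return rectangles


def is_monochromatic_partition(rectangles, inputs_x, inputs_y, f):
    """Check that the rectangles partition X x Y and that f equals the label a on each."""
    covered = []
    for (_, a), (a_m, y_ma) in rectangles.items():
        cells = list(product(a_m, y_ma))
        if any(f(x, y) != a for x, y in cells):
            return False
        covered.extend(cells)
    all_pairs = list(product(inputs_x, inputs_y))
    # Same number of cells and same set of cells: no overlaps and no gaps.
    return len(covered) == len(all_pairs) and set(covered) == set(all_pairs)


def equality(x, y):
    return int(x == y)


# Hypothetical protocol: Alice sends x itself, Bob answers 1 iff the message equals y.
STRINGS = ["00", "01", "10", "11"]
RECTS = one_way_rectangles(STRINGS, STRINGS, lambda x: x, lambda m, y: int(m == y))
```

With $k = 2$ this yields $2^{k+1} = 8$ non-empty rectangles, one pair $(A_m, Y_{m,0})$, $(A_m, Y_{m,1})$ per message.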

As a simple application of this we prove the so-called "rank lower bound". Consider some
communication complexity problem $f : X \times Y \to \{0,1\}$. Let $M_f$ be the $|X| \times |Y|$ matrix whose
entries are defined by $M_f(x,y) = f(x,y)$. This is called the communication matrix of $f$. It can
be viewed as a 2-dimensional truth table. We use $\mathrm{rank}(f)$ to denote the rank of this matrix over
the field of real numbers. For example, the communication matrix for the equality function is the
$2^n \times 2^n$ identity matrix, which has 1s on its diagonal and 0s elsewhere. Hence $\mathrm{rank}(\mathrm{EQ}) = 2^n$.

Suppose we have some $c$-bit deterministic protocol that computes $f$. We know that this
partitions the input space $X \times Y$ into rectangles $R_1, \ldots, R_{2^c}$. Since each 1-input $(x,y)$ occurs in exactly
one 1-rectangle, we have

$$M_f = \sum_{i : a_i = 1} R_i,$$

where we view $R_i$ as a $2^n \times 2^n$ matrix with 1s on its elements and 0s elsewhere. Note that $R_i$ is a
matrix of rank 1. Hence, using $\mathrm{rank}(A + B) \leq \mathrm{rank}(A) + \mathrm{rank}(B)$, we get

$$\mathrm{rank}(M_f) = \mathrm{rank}\Big(\sum_{i : a_i = 1} R_i\Big) \leq \sum_{i : a_i = 1} \mathrm{rank}(R_i) = \sum_{i : a_i = 1} 1 \leq 2^c.$$

But that means that a lower bound on the rank of $M_f$ implies a lower bound on the communication:
$c \geq \log \mathrm{rank}(f)$. In particular, it follows that for the equality problem, the communication $c$
needs to be at least $n$.
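
The rank lower bound is easy to evaluate for small functions. The sketch below is an illustration under stated assumptions rather than a prescribed method: it computes the rank by exact Gaussian elimination over the rationals (which agrees with the rank over the reals for 0/1 matrices) and returns the smallest integer $c$ with $2^c \geq \mathrm{rank}(M_f)$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import ceil, log2


def rank_over_rationals(matrix):
    """Rank of an integer matrix via exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below position `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def communication_matrix(f, n):
    """The 2^n x 2^n matrix with entries f(x, y) for n-bit tuples x and y."""
    strings = list(product((0, 1), repeat=n))
    return [[f(x, y) for y in strings] for x in strings]


def rank_lower_bound(f, n):
    """Smallest integer c with 2^c >= rank(M_f); assumes f is not identically 0."""
    return ceil(log2(rank_over_rationals(communication_matrix(f, n))))


def equality(x, y):
    return int(x == y)
```

Calling `rank_lower_bound(equality, 3)` evaluates the bound for equality on 3-bit inputs, whose communication matrix is the $8 \times 8$ identity.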

B.2 Randomized protocols 

In a randomized protocol, Alice and Bob may flip coins and the protocol has to output the right
value $f(x,y)$ with probability $\geq 2/3$ for all inputs $(x,y)$ in the domain of $f$. We can fix these coins
to obtain a deterministic protocol. Suppose randomized protocol $A$ uses $c$ bits of communication
and has success probability 2/3 on all inputs. Let $A(x,y,r_A,r_B) = 1$ if the protocol gives the correct
output $f(x,y)$ on input $x,y$ using coin flips $r_A$ for Alice and $r_B$ for Bob, and $A(x,y,r_A,r_B) = 0$
otherwise. For each $x,y$, a good randomized protocol satisfies

$$\mathbb{E}_{r_A,r_B}[A(x,y,r_A,r_B)] \geq 2/3,$$

where the expectation is taken over uniformly chosen strings $r_A$ and $r_B$. Now let $\mu : \{0,1\}^n \times
\{0,1\}^n \to [0,1]$ be an input distribution. Then also

$$\mathbb{E}_{\mu,r_A,r_B}[A(x,y,r_A,r_B)] \geq 2/3,$$

where the expectation is taken over $r_A$, $r_B$, and $x,y$ picked according to $\mu$. By the averaging
principle, there exists a way to fix $r_A$ and $r_B$ such that the success probability (under $\mu$) of the resulting
deterministic protocol is at least 2/3. Accordingly, if we want to lower bound the randomized
communication complexity of a function, it suffices to find some "hard" input distribution $\mu$, and
to show that all deterministic protocols that have error at most 1/3 under that distribution need
a lot of communication.
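
The averaging step can be illustrated with a toy computation. The sketch below assumes a hypothetical one-message protocol for equality on 2-bit strings (Alice picks a random position $r$, sends $r$ and $x_r$, and Bob answers "equal" iff $y_r = x_r$); this protocol is only an illustration and does not reach success probability 2/3 on every input. The point is merely that the best fixing of the coins does at least as well under $\mu$ as the average over the coins.

```python
from fractions import Fraction
from itertools import product


def mu_success(success, weights, coins):
    """Success probability under mu of the deterministic protocol with the coins fixed."""
    return sum(w * success(x, y, coins) for (x, y), w in weights.items())


def best_coin_fixing(success, weights, coin_strings):
    """Return (coins, success probability) for the best way to fix the coins."""
    return max(((r, mu_success(success, weights, r)) for r in coin_strings),
               key=lambda pair: pair[1])


def toy_success(x, y, r):
    """1 if the toy protocol with coin r answers the equality question correctly."""
    answer = int(x[r] == y[r])
    return int(answer == int(x == y))


# Uniform input distribution mu on pairs of 2-bit strings (an illustrative choice).
PAIRS = list(product(product((0, 1), repeat=2), repeat=2))
UNIFORM = {xy: Fraction(1, len(PAIRS)) for xy in PAIRS}
COINS = [0, 1]

AVERAGE = sum(mu_success(toy_success, UNIFORM, r) for r in COINS) / len(COINS)
BEST_COINS, BEST_VALUE = best_coin_fixing(toy_success, UNIFORM, COINS)
```

Here `BEST_VALUE` is the success probability of the derandomized protocol and `AVERAGE` that of the randomized one, both measured under the chosen $\mu$.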

The reason why the step to deterministic protocols is helpful is that deterministic protocols
partition the input space into rectangles, as we've seen before. Suppose we can show that all "large"
rectangles in the communication matrix have roughly as many 0s as 1s in them (weighted according
to $\mu$). Then the protocol will make a large error on all large rectangles. Conversely, if we know
the protocol does not make a large error, most of its rectangles must have been "small". But that
can only be if there are many rectangles. Since the number of rectangles is $2^c$, the communication
$c$ must have been large. This idea leads to the following lower bound method. The discrepancy
of rectangle $R = A \times B$ under $\mu$ is the difference between the weight of the 0s and the 1s in that
rectangle:

$$\delta_\mu(R) = |\mu(R \cap f^{-1}(1)) - \mu(R \cap f^{-1}(0))|.$$

The discrepancy of $f$ under $\mu$ is the maximum over all rectangles:

$$\delta_\mu(f) = \max_R \delta_\mu(R).$$

If $f$ has small discrepancy, that means that all "large" rectangles are roughly balanced. Suppose a
deterministic protocol partitions the input space into rectangles $R_1, \ldots, R_{2^c}$. Suppose it has success
probability $1/2 + \epsilon$. The best bias (difference between success and failure probabilities) that the
protocol can achieve on rectangle $R_i$ is $\delta_\mu(R_i)$, by giving the output with highest weight in that
rectangle. The success probability is $\sum_i \mu(R_i \cap f^{-1}(a_i))$ and the error probability is
$\sum_i \mu(R_i \cap f^{-1}(1 - a_i))$, where $a_i$ is the majority value of $f$ on the pairs $(x,y) \in R_i$, weighted
according to $\mu$. Hence we have

$$2\epsilon \leq \sum_{i=1}^{2^c} \mu(R_i \cap f^{-1}(a_i)) - \sum_{i=1}^{2^c} \mu(R_i \cap f^{-1}(1 - a_i)) \leq \sum_{i=1}^{2^c} \delta_\mu(R_i) \leq 2^c \, \delta_\mu(f).$$

This is a lower bound on the communication: $c \geq \log(2\epsilon/\delta_\mu(f))$. Accordingly, a distribution $\mu$
where $\delta_\mu(f)$ is small gives a lower bound on the communication of deterministic protocols for $f$
under $\mu$, and then the same lower bound applies to randomized protocols.
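
For very small input sets the discrepancy can be computed by brute force over all rectangles. The following sketch is only an illustration of the definitions above (the function names are hypothetical, and the enumeration is exponential in $|X|$ and $|Y|$, so it is usable only for tiny examples). It evaluates $\delta_\mu(f)$ for the inner product function on 2-bit inputs under the uniform distribution and provides the bound $c \geq \log(2\epsilon/\delta_\mu(f))$ as a function.

```python
from fractions import Fraction
from itertools import combinations, product
from math import log2


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a tuple."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def discrepancy(f, xs, ys, mu):
    """Maximum over rectangles A x B of |mu(R and f=1) - mu(R and f=0)|."""
    best = Fraction(0)
    for a_set in nonempty_subsets(xs):
        for b_set in nonempty_subsets(ys):
            signed = sum(mu[(x, y)] * (1 if f(x, y) == 1 else -1)
                         for x in a_set for y in b_set)
            best = max(best, abs(signed))
    return best


def communication_lower_bound(eps, disc):
    """The bound c >= log2(2 * eps / disc) for success probability 1/2 + eps."""
    return log2(float(2 * eps / disc))


def inner_product(x, y):
    return sum(a * b for a, b in zip(x, y)) % 2


# Inner product on 2-bit inputs under the uniform distribution.
INPUTS = list(product((0, 1), repeat=2))
UNIFORM = {(x, y): Fraction(1, 16) for x in INPUTS for y in INPUTS}
DISC_IP2 = discrepancy(inner_product, INPUTS, INPUTS, UNIFORM)
```

A protocol with error probability 1/3 has $\epsilon = 1/6$, so `communication_lower_bound(Fraction(1, 6), DISC_IP2)` gives the corresponding bound for this tiny instance.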

B.3 Discrepancy of the inner product function 

To illustrate the discrepancy lower bound technique, we now consider the inner product function,
defined by $\mathrm{IP}(x,y) = x \cdot y \pmod 2$. We will show that its discrepancy under the uniform distribution
is very small. We analyze the $2^n \times 2^n$ matrix $M$ whose $(x,y)$ entry is $(-1)^{x \cdot y}$. This is just the
communication matrix for IP, with 0s replaced by 1s, and 1s replaced by $-1$s. Lindsey's lemma
shows that large rectangles in $M$ are quite balanced:

Lemma 2 (Lindsey). For every rectangle $R = A \times B$, the absolute value of the sum of the $M$-entries
in that rectangle is at most $\sqrt{|A| \cdot |B| \cdot 2^n}$.
Proof: It is easy to see that $M$ is symmetric and $M^2 = 2^n I$. This implies, for any vector $v$,

$$\|Mv\|^2 = v^T M^T M v = 2^n v^T v = 2^n \|v\|^2,$$

where the norm is the usual Euclidean vector length. Let $v_A \in \{0,1\}^{2^n}$ and $v_B \in \{0,1\}^{2^n}$ be
the characteristic (column) vectors of the sets $A$ and $B$. The sum of the $M$-entries in $R$ is
$\sum_{a \in A, b \in B} M_{ab} = v_A^T M v_B$. We can bound this using Cauchy-Schwarz:

$$|v_A^T M v_B| \leq \|v_A\| \cdot \|M v_B\| = \|v_A\| \cdot \sqrt{2^n} \, \|v_B\| = \sqrt{|A| \cdot |B| \cdot 2^n}.$$
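
Both facts used in the proof can be verified exhaustively for small $n$. The sketch below is an illustrative check, not part of the proof; the helper names are hypothetical. It builds the matrix $M$, allows a direct test of $M^2 = 2^n I$, and checks Lindsey's bound in squared form (to stay in integer arithmetic) for every rectangle, with subsets encoded as bitmasks.

```python
from itertools import product


def sign_matrix(n):
    """The 2^n x 2^n matrix with (x, y) entry (-1)^(x . y)."""
    strings = list(product((0, 1), repeat=n))
    return [[(-1) ** sum(a * b for a, b in zip(x, y)) for y in strings]
            for x in strings]


def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def lindsey_holds(n):
    """Check (sum of M over A x B)^2 <= |A| |B| 2^n for all rectangles (small n only)."""
    m = sign_matrix(n)
    size = 2 ** n
    for a_mask in range(1, 2 ** size):
        rows = [i for i in range(size) if (a_mask >> i) & 1]
        # Column sums restricted to the rows in A, reused for every choice of B.
        col_sums = [sum(m[i][j] for i in rows) for j in range(size)]
        for b_mask in range(1, 2 ** size):
            cols = [j for j in range(size) if (b_mask >> j) & 1]
            total = sum(col_sums[j] for j in cols)
            if total * total > len(rows) * len(cols) * size:
                return False
    return True
```

The exhaustive check is feasible for $n \leq 3$; beyond that the number of rectangles, $(2^{2^n} - 1)^2$, grows too quickly.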



Let $\mu(x,y) = 1/2^{2n}$ be the uniform input distribution. Note that the discrepancy of the rectangle
$R$ under $\mu$ is exactly the difference between the number of $+1$s and $-1$s in $R$, divided by $2^{2n}$. By
Lindsey's lemma, this is $\delta_\mu(R) \leq \sqrt{|A| \cdot |B|}/2^{3n/2}$. Because $|A|, |B| \leq 2^n$, it follows that the
discrepancy of the inner product function under the uniform distribution is $\delta_\mu(\mathrm{IP}) \leq 2^{-n/2}$. Hence
we get an $n/2 - O(1)$ lower bound on the randomized communication complexity of IP.

B.4 The lower bound for the Distributed Deutsch-Jozsa problem 

Recall the Distributed Deutsch-Jozsa problem from Section 3.4. Buhrman, Cleve, and Wigderson
[33] used a combinatorial result of Frankl and Rödl [62] to prove the following classical lower
bound:

Theorem 3. Every deterministic classical protocol that solves the Distributed Deutsch-Jozsa
problem needs to communicate at least $0.007n$ bits.

Proof: Suppose there is a $c$-bit deterministic classical protocol for the problem. Each $c$-bit
conversation corresponds to a rectangle $R = A \times B$, with $A, B \subseteq \{0,1\}^n$, such that the protocol has
the same conversation and output if, and only if, $(x,y) \in R$. Since there are at most $2^c$ possible
conversations, the protocol partitions $\{0,1\}^n \times \{0,1\}^n$ into at most $2^c$ different such rectangles. Now
consider all $n$-bit strings $x$ with Hamming weight $n/2$ (i.e., $n/2$ ones and $n/2$ zeroes). There are
$\binom{n}{n/2} \approx 2^n/\sqrt{n}$ of those. Since every $(x,x)$-pair must occur in some rectangle and there are only $2^c$
rectangles, there is a rectangle $R = A \times B$ that contains at least $2^n/(\sqrt{n} \, 2^c)$ different such
$(x,x)$-pairs. Let $S = \{x : |x| = n/2, (x,x) \in R\}$ be the set of such $x$. Since $R$ contains some $(x,x)$-pairs
(on which the protocol outputs 1) and the protocol has the same output for all inputs in $R$, $R$
cannot contain any 0-inputs. This implies that the Hamming distance of every pair $x, y \in S$ is
different from $n/2$, for otherwise $(x,y)$ would be a 0-input in $R$. Viewing the strings $x$ in $S$ as
characteristic vectors of sets, the Hamming distance of two strings of weight $n/2$ equals $n$ minus
twice the size of their intersection, so the size of the intersection of $x, y \in S$ is never
$n/4$. Thus we have a set system $S$ of at least $2^n/(\sqrt{n} \, 2^c)$ sets over an $n$-element universe, such that
the size of the intersection of any two sets in $S$ is not $n/4$. However, by Corollary 1.2 of [62], such
a set system can have at most $1.99^n$ elements, so we have

$$\frac{2^n}{\sqrt{n} \, 2^c} \leq |S| \leq 1.99^n.$$

This implies $c \geq \log(2^n/(\sqrt{n} \, 1.99^n)) \geq 0.007n$.
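
The arithmetic behind the constant 0.007 can be checked numerically. The sketch below, with hypothetical names, rewrites $\log(2^n/(\sqrt{n} \, 1.99^n))$ as $n \log_2(2/1.99) - \frac{1}{2}\log_2 n$ to avoid huge intermediate numbers. Because of the $\frac{1}{2}\log_2 n$ term, the final inequality $\geq 0.007n$ should be read as holding for sufficiently large $n$.

```python
from math import log2

# Per-bit rate log2(2 / 1.99); it is slightly larger than 0.007.
RATE = log2(2 / 1.99)


def dj_lower_bound(n):
    """Evaluate log2(2^n / (sqrt(n) * 1.99^n)) = n * log2(2/1.99) - log2(n) / 2."""
    return n * RATE - 0.5 * log2(n)


def bound_reaches_target(n, target_rate=0.007):
    """True if the bound is at least target_rate * n for this particular n."""
    return dj_lower_bound(n) >= target_rate * n
```

Evaluating `bound_reaches_target` for a few values of $n$ shows where the linear term starts to dominate the logarithmic correction.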






C Razborov's Lower Bound for the Quantum Communication Complexity of Intersection



While the previous section discussed some basic methods for lower bounding classical communi- 
cation complexity, here we focus on methods to lower bound quantum communication complexity 
(sometimes with prior entanglement). 



C.1 The Kremer-Razborov-Yao lemma and its consequences



The following lemma is due to Razborov [112, Proposition 3.3] and is similar to earlier statements
by Yao [137] and Kremer [82]. It can intuitively be viewed as a quantum analogue of the
rectangle-decomposition of classical protocols that we explained in Section B.1. We skip the easy proof,
which is by induction on $q$.

Lemma 4 (Kremer-Razborov-Yao). Let $|\Psi\rangle$ denote the (possibly entangled) starting state of a
quantum protocol that communicates $q$ qubits and has binary output. For all
inputs $x$ of Alice and $y$ of Bob, there exist linear operators $A_h(x)$, $B_h(y)$, for all $h \in \{0,1\}^{q-1}$,
each with operator norm (i.e., largest singular value) at most 1, such that the acceptance probability
(i.e., probability of output '1') of the protocol is

$$P(x,y) = \Big\| \sum_{h \in \{0,1\}^{q-1}} (A_h(x) \otimes B_h(y)) |\Psi\rangle \Big\|^2,$$

where the norm is the usual Euclidean vector length.

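Consider the special case where the protocol starts without entanglement, so we can write
$|\Psi\rangle = |\Psi_A\rangle|\Psi_B\rangle$. In this case we can rewrite the acceptance probabilities as

$$\begin{aligned}
P(x,y) &= \Big\| \sum_{h \in \{0,1\}^{q-1}} (A_h(x) \otimes B_h(y)) |\Psi_A\rangle|\Psi_B\rangle \Big\|^2 \\
&= \langle\Psi_A|\langle\Psi_B| \Big( \sum_{h \in \{0,1\}^{q-1}} (A_h(x) \otimes B_h(y)) \Big)^{*} \cdot \Big( \sum_{h' \in \{0,1\}^{q-1}} (A_{h'}(x) \otimes B_{h'}(y)) \Big) |\Psi_A\rangle|\Psi_B\rangle \\
&= \sum_{h,h' \in \{0,1\}^{q-1}} \langle\Psi_A| A_h(x)^{*} A_{h'}(x) |\Psi_A\rangle \cdot \langle\Psi_B| B_h(y)^{*} B_{h'}(y) |\Psi_B\rangle.
\end{aligned}$$

Let $a(x)$ be the $2^{2q-2}$-dimensional row vector with $(h,h')$-entry equal to $\langle\Psi_A| A_h(x)^{*} A_{h'}(x) |\Psi_A\rangle$,
and similarly define column vector $b(y)$ with entries $\langle\Psi_B| B_h(y)^{*} B_{h'}(y) |\Psi_B\rangle$; then the last expression
is just the scalar product $a(x) b(y)$. If we now define $A$ to be the $|X| \times 2^{2q-2}$ matrix with rows $a(x)$,
and $B$ the $2^{2q-2} \times |Y|$ matrix with columns $b(y)$, then we have proved the following lemma.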

Lemma 5. Consider a quantum communication protocol (without prior entanglement) on input-set
$X \times Y$, that communicates $q$ qubits, with acceptance probabilities denoted by $P(x,y)$, and $P$ the
corresponding $|X| \times |Y|$ matrix. There exist an $|X| \times 2^{2q-2}$ matrix $A$ and a $2^{2q-2} \times |Y|$ matrix $B$, both
with entries of absolute value at most 1, such that $P = AB$.
Note that the rank of matrix $P$ is at most $2^{2q-2}$, since $\mathrm{rank}(AB) \leq \min(\mathrm{rank}(A), \mathrm{rank}(B))$. This
allows us to generalize the classical rank lower bound from Section B.1 to the quantum domain. If we
have a $q$-qubit protocol that computes some function $f : X \times Y \to \{0,1\}$ with success probability 1,
then $P(x,y)$ equals $f(x,y)$, and the $|X| \times |Y|$ matrix $P$ is actually the communication matrix $M_f$,
whose $(x,y)$ entry is $f(x,y)$. Hence we obtain a lower bound
$q \geq \frac{\log \mathrm{rank}(P)}{2} + 1 = \frac{\log \mathrm{rank}(f)}{2} + 1$
on the quantum communication of protocols with success probability 1. Similarly, one can obtain
lower bounds on the bounded-error quantum communication complexity by lower bounding the
rank needed for a matrix $P$ that is close to the matrix of function values at each entry (since an
$\epsilon$-error protocol satisfies $|P(x,y) - f(x,y)| \leq \epsilon$ for all inputs).
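
To compare the two rank bounds side by side, the following small sketch evaluates the classical bound $c \geq \log \mathrm{rank}(f)$ and the quantum bound $q \geq \frac{1}{2}\log \mathrm{rank}(f) + 1$ for a given rank. The function names are hypothetical, and the quantum bound applies only under the assumptions stated above (success probability 1, no prior entanglement).

```python
from math import log2


def classical_rank_lower_bound(rank):
    """Deterministic classical bound c >= log2(rank(M_f))."""
    return log2(rank)


def quantum_exact_lower_bound(rank):
    """Exact quantum bound q >= log2(rank(M_f)) / 2 + 1, from rank(P) <= 2^(2q - 2)."""
    return log2(rank) / 2 + 1


def equality_bounds(n):
    """Both bounds for equality on n-bit inputs, whose matrix has rank 2^n."""
    rank = 2 ** n
    return classical_rank_lower_bound(rank), quantum_exact_lower_bound(rank)
```

For equality on $n$-bit inputs this gives $n$ classically and $n/2 + 1$ in the quantum case, so the rank method by itself can show at most a factor-2 gap for exact protocols.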

Finally, let us note without proof that one can also use the discrepancy method (Section B.2)
to lower bound quantum communication complexity [82], even for protocols with prior
entanglement [84]. Since the Inner Product function has very small discrepancy (Section B.3), we thus have
another way of showing a linear lower bound for it, different from the one explained in Section 3.8.



C.2 Translation from protocols to polynomials 

The following key lemma is implicit in Razborov's paper [112]; the presentation we give here is
taken from [80]. It allows us to translate the average acceptance probability of a $q$-qubit protocol
(as a function of the intersection size $i$ of the inputs $x$ and $y$, viewed as subsets of $\{1, \ldots, n\}$) to a
polynomial in $i$ of degree roughly $q$. Accordingly, efficient protocols give low-degree polynomials.

Razborov's proof relies on the following linear algebraic notions. The operator norm $\|A\|$ of a
matrix $A$ is its largest singular value $\sigma_1$ (not to be confused with the Euclidean vector norm of
Lemma 4). The trace inner product (also known as Hilbert-Schmidt inner product) between $A$
and $B$ is $\langle A, B \rangle = \mathrm{Tr}(A^{*} B)$. The trace norm is $\|A\|_{tr} = \max\{|\langle A, B \rangle| : \|B\| = 1\} = \sum_i \sigma_i$, the sum
of all singular values of $A$. The Frobenius norm is $\|A\|_F = \sqrt{\sum_{ij} |A_{ij}|^2} = \sqrt{\sum_i \sigma_i^2}$.

Lemma 6. Consider a quantum communication protocol (without prior entanglement) on $n$-bit
inputs $x$ and $y$, that communicates $q$ qubits, with acceptance probabilities denoted by $P(x,y)$. Define

$$P(i) = \mathbb{E}_{|x| = |y| = n/4, \, |x \wedge y| = i}[P(x,y)],$$

where the expectation is taken uniformly over all $x, y$ that each have weight $n/4$ and that have
intersection $i$. For every $d \leq n/4$ there exists a degree-$d$ polynomial $q$ such that $|P(i) - q(i)| \leq
2^{2q - d/4}$ for all $i \in \{0, \ldots, n/8\}$.

Proof: We only consider the $\mathcal{N} = \binom{n}{n/4}$ strings of weight $n/4$. Let $P$ denote the $\mathcal{N} \times \mathcal{N}$ matrix
of the acceptance probabilities on these inputs. By Lemma 5 we can write $P = AB$, where $A$
is an $\mathcal{N} \times 2^{2q-2}$ matrix with each entry at most 1 in absolute value, and similarly for $B$. Note
that $\|A\|_F, \|B\|_F \leq \sqrt{\mathcal{N} 2^{2q-2}}$. By the Cauchy-Schwarz inequality for unitarily invariant norms
[19, p. 95], we have

$$\|P\|_{tr} \leq \|A\|_F \cdot \|B\|_F \leq \mathcal{N} 2^{2q-2}.$$

Let $\mu_i$ denote the $\mathcal{N} \times \mathcal{N}$ matrix corresponding to the uniform probability distribution on $\{(x,y) :
|x \wedge y| = i\}$. These "combinatorial matrices" have been well studied [81]. Note that $\langle P, \mu_i \rangle$ is the
expected acceptance probability $P(i)$ of the protocol under that distribution. One can show that the
different $\mu_i$ commute; thus they have the same eigenspaces $E_0, \ldots, E_{n/4}$ and can be simultaneously
diagonalized by some orthogonal matrix $U$. For $t \in \{0, \ldots, n/4\}$, let $(UPU^T)_t$ denote the block of
$UPU^T$ corresponding to $E_t$, and let $a_t = \mathrm{Tr}((UPU^T)_t)$ be its trace. Then we have

$$\sum_{t=0}^{n/4} |a_t| \leq \sum_{t=0}^{n/4} \|(UPU^T)_t\|_{tr} \leq \|UPU^T\|_{tr} = \|P\|_{tr} \leq \mathcal{N} 2^{2q-2},$$

where the second inequality is a property of the trace norm.

Let $\lambda_{it}$ be the eigenvalue of $\mu_i$ in eigenspace $E_t$. Knuth [81] gives an exact combinatorial
expression for $\lambda_{it}$. We will not state this explicitly here, but just note that $\lambda_{it}$ is a degree-$t$
polynomial in $i$, and that $|\lambda_{it}| \leq 2^{-t/4}/\mathcal{N}$ for $i \leq n/8$. Now consider the high-degree polynomial $p$
defined by

$$p(i) = \sum_{t=0}^{n/4} a_t \lambda_{it}.$$



This satisfies

$$p(i) = \sum_{t=0}^{n/4} \mathrm{Tr}((UPU^T)_t) \lambda_{it} = \langle UPU^T, U \mu_i U^T \rangle = \langle P, \mu_i \rangle = P(i).$$

Let $q$ be the degree-$d$ polynomial obtained by removing the high-degree parts of $p$:

$$q(i) = \sum_{t=0}^{d} a_t \lambda_{it}.$$

Then $P$ and $q$ are close on all integers $i$ between 0 and $n/8$:

$$|P(i) - q(i)| = |p(i) - q(i)| = \Big| \sum_{t=d+1}^{n/4} a_t \lambda_{it} \Big| \leq \frac{2^{-d/4}}{\mathcal{N}} \sum_{t=0}^{n/4} |a_t| \leq 2^{-d/4 + 2q}.$$



C.3 The quantum lower bound for Intersection 

Now suppose we have a $q$-qubit protocol for the Intersection problem, say with error probability at
most 1/3 on every input $x, y$. Our goal is to show that $q$ is at least about $\sqrt{n}$. Since the protocol
outputs 1 with high probability if, and only if, $x$ and $y$ intersect in at least one point, we know the
following about the quantity $P(i) = \mathbb{E}_{|x| = |y| = n/4, \, |x \wedge y| = i}[P(x,y)]$: $P(0) \in [0, 1/3]$ and $P(i) \in [2/3, 1]$
if $i \in \{1, \ldots, n\}$.

This $P(i)$ is only defined on integers, but by Lemma 6 we can approximate it up to some
small additive error $\epsilon$ using a polynomial $q$ of degree $d = 8q + \lceil 4 \log(1/\epsilon) \rceil$. Then we know $q(0) \in
[-\epsilon, 1/3 + \epsilon]$ and $q(i) \in [2/3 - \epsilon, 1 + \epsilon]$. However, the following result of Ehlich and Zeller [59] and
Rivlin and Cheney [114] says that such a polynomial $q$ must have degree about $\sqrt{n}$:

Theorem 7 (Ehlich & Zeller; Rivlin & Cheney). Let $p : \mathbb{R} \to \mathbb{R}$ be a polynomial such that
$b_1 \leq p(i) \leq b_2$ for every integer $0 \leq i \leq N$, and the derivative $p'$ satisfies $|p'(x)| \geq c$ for some real
$0 \leq x \leq N$. Then the degree of $p$ is at least $\sqrt{cN/(c + b_2 - b_1)}$.
It thus follows that the original protocol must have communicated at least about $\sqrt{n}$ qubits.
In his paper, Razborov gives essentially tight lower bounds not just for the Intersection problem,
but for any communication problem that depends only on the size of the intersection of the inputs
$x$ and $y$. This combines Lemma 6 with a polynomial degree lower bound due to Paturi [105].
The lower bound proof we gave here only applies to quantum protocols that do not start with an
entangled state, but Razborov showed the same lower bound for protocols with prior entanglement,
at the expense of some more technical complication. Recently, an alternative proof was obtained
by Sherstov [122].
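
The chain of estimates in Section C.3 can be traced numerically. The sketch below is an illustration under assumptions that are not spelled out in the text: logarithms are taken base 2, Theorem 7 is applied with $N = n/8$, $b_1 = -\epsilon$, $b_2 = 1 + \epsilon$, and the derivative bound $c = 1/3 - 2\epsilon$ is obtained from the jump of the approximating polynomial between $i = 0$ and $i = 1$ (mean value theorem). All function names are hypothetical.

```python
from math import ceil, log2, sqrt


def approximation_degree(q, eps):
    """Degree d = 8q + ceil(4 log2(1/eps)) chosen for additive error at most eps."""
    return 8 * q + ceil(4 * log2(1 / eps))


def approximation_error(q, d):
    """Error bound 2^(2q - d/4) of Lemma 6 for a q-qubit protocol and degree d."""
    return 2 ** (2 * q - d / 4)


def degree_lower_bound(n, eps):
    """sqrt(c N / (c + b2 - b1)) with the (assumed) parameters described above."""
    c = 1 / 3 - 2 * eps          # assumed derivative bound between i = 0 and i = 1
    big_n = n / 8                # assumed range of integers on which q is bounded
    return sqrt(c * big_n / (c + 1 + 2 * eps))   # b2 - b1 = 1 + 2 * eps


def qubit_lower_bound(n, eps):
    """Solve 8q + ceil(4 log2(1/eps)) >= degree_lower_bound(n, eps) for q."""
    return (degree_lower_bound(n, eps) - ceil(4 * log2(1 / eps))) / 8
```

Quadrupling $n$ doubles `degree_lower_bound`, which is the $\sqrt{n}$ scaling claimed for the number of qubits.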

D Asymmetric Detection Efficiency 

Here we prove the results stated in Section 7.1.3 concerning the connection between asymmetric
experiments where a single detector is inefficient, and classical protocols with perfect detectors that
use one-way communication, i.e., where all the communication takes place from Alice to Bob.

Let us suppose that reproducing the quantum correlations with error $\varepsilon'$, using one-way
communication from Alice to Bob and shared randomness, requires $C_{\varepsilon'}$ bits of communication.
More precisely, the error is measured as the total variational distance
between the predictions of quantum theory $P_{QM}(ab|xy)$ and the output $P_{class}(ab|xy)$ of the classical
protocol:

\[ \sum_{ab} \left| P_{QM}(ab|xy) - P_{class}(ab|xy) \right| \le \varepsilon' \quad \text{for all } x, y. \]



Let us also suppose that there exists a protocol that uses only shared randomness (a local
hidden variable model) in which Alice's detector has efficiency $\eta_\varepsilon$ and Bob's detector is perfect,
and that reproduces the quantum correlations with error $\varepsilon$. More precisely, the fact that Alice's
detector has efficiency $\eta$ means that $P(a \neq \perp \mid b, x, y) = \eta$ independently of $b, x, y$, where $\perp$ corresponds to
Alice's detector not giving a result. The error is measured as the total variational distance between
the predictions of quantum theory $P_{QM}(ab|xy)$ (when the detectors are 100% efficient) and the
predictions $P_{LHV}(ab|xy)$ of the LHV model. We divide the latter by $\eta$ to take into account that
Alice's detector gives a result with probability $\eta$:

\[ \sum_{ab} \left| P_{QM}(ab|xy) - \frac{P_{LHV}(ab|xy)}{\eta} \right| \le \varepsilon \quad \text{for all } x, y. \]



Then we have: 

Theorem 8. With the above hypothesis, we have $\eta_\varepsilon \le O\big((-\ln \varepsilon)\, 2^{-C_{2\varepsilon}}\big)$.

To prove this, we use the local hidden variable (LHV) model with detection efficiency $\eta_\varepsilon$
(written $\eta$ below) to construct a classical protocol with communication. The LHV model uses shared
randomness $r$. Alice and Bob share $k$ independently chosen instances of the shared randomness
$r_1, r_2, \ldots, r_k$. Alice checks whether she should give an output for at least one value of the shared
randomness. This occurs with probability $1 - (1 - \eta)^k$. If so, she sends Bob the index $j$ of one
instance $r_j$ for which she gives an output (using $\log k$ bits of communication), and they give the
corresponding output. If there is no instance of the shared randomness for which Alice should give
an output in the LHV model, Alice gives a random output and sends Bob a random index $j$. This occurs
with probability $(1 - \eta)^k$, and in this case Alice and Bob's results will most likely be completely
different from those predicted by quantum mechanics. The error probability in the model with
communication is thus $P(\text{error}) \le (1 - (1 - \eta)^k)\varepsilon + (1 - \eta)^k \le \varepsilon + (1 - \eta)^k$. Let us take
$k = \frac{\ln \varepsilon}{\ln(1 - \eta)}$;
then the error is bounded by $P(\text{error}) \le \varepsilon + (1 - \eta)^{\ln \varepsilon / \ln(1 - \eta)} = 2\varepsilon$. But we know that to produce
the correlations with error $2\varepsilon$ we need at least $C_{2\varepsilon}$ bits of one-way communication, hence $k \ge 2^{C_{2\varepsilon}}$.
Therefore $-\ln(1 - \eta) \le (-\ln \varepsilon)\, 2^{-C_{2\varepsilon}}$, which implies the result since $\eta \le -\ln(1 - \eta)$.
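
The bookkeeping in this proof can be checked numerically. The following Python sketch is an illustration only: the function names (`copies_needed`, `message_bits`, `alice_message`, `error_bound`) are hypothetical, and the number of copies is rounded up to an integer, a detail the proof leaves implicit. It computes how many instances of shared randomness are needed, how many bits Alice sends, and the resulting bound on the error.

```python
import math
import random


def copies_needed(eta: float, eps: float) -> int:
    """Return an integer k with (1 - eta)**k <= eps.

    This is ln(eps) / ln(1 - eta) from the proof, rounded up
    (the rounding is an assumption of this sketch).
    """
    if not (0.0 < eta < 1.0 and 0.0 < eps < 1.0):
        raise ValueError("eta and eps must lie strictly between 0 and 1")
    k = max(1, math.ceil(math.log(eps) / math.log(1.0 - eta)))
    # Guard against floating-point rounding in the logarithms.
    while (1.0 - eta) ** k > eps:
        k += 1
    return k


def message_bits(k: int) -> int:
    """Bits Alice needs to send an index in {0, ..., k-1}."""
    return math.ceil(math.log2(k))


def alice_message(fires, rng):
    """Pick the index Alice sends to Bob.

    fires[j] is True if, in the LHV model run on the j-th instance of
    shared randomness, Alice's detector gives a result. Returns (j, ok),
    where ok is False when no instance fires and j is chosen at random.
    """
    for j, fired in enumerate(fires):
        if fired:
            return j, True
    return rng.randrange(len(fires)), False


def error_bound(eta: float, eps: float, k: int) -> float:
    """Upper bound (1 - (1-eta)^k) * eps + (1-eta)^k on the error."""
    no_output = (1.0 - eta) ** k
    return (1.0 - no_output) * eps + no_output


if __name__ == "__main__":
    eta, eps = 0.5, 0.01
    k = copies_needed(eta, eps)
    print(k, message_bits(k), error_bound(eta, eps, k))
```

With these choices the error bound stays below $2\varepsilon$, while the message length grows only like $\log k$, which is what forces $k \ge 2^{C_{2\varepsilon}}$ in the argument.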

(Note that the above mapping does not hold when both Alice and Bob's detectors are inefficient,
since if they try the above procedure, they will need to find a value of the shared randomness $r_j$ for
which both their detectors produce an output, i.e., solve an instance of the Intersection problem.)

Let us apply this result to the Hidden Matching problem. As mentioned in Section 3.7, this
problem can be solved using $\log n$ ebits and $\log n$ bits of classical communication from Alice to
Bob; but if only classical communication from Alice to Bob is allowed, then at least $\Omega(\sqrt{n})$ bits
of communication are required, even allowing for a small error probability. This implies that the
correlations obtained by measuring the ebits can be reproduced only with at least $\Omega(\sqrt{n})$ bits
of classical communication from Alice to Bob, even allowing for a small error probability. The
above result then shows that these correlations remain non-local (i.e., cannot be reproduced by a
classical model without communication) if Bob's detector has 100% efficiency and Alice's detector
has efficiency $\eta \ge 2^{-\Omega(\sqrt{n})}$, even allowing for a small error probability.
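
The last step combines Theorem 8 with the one-way lower bound for Hidden Matching. As a sketch, assume the error $\varepsilon$ is a fixed constant and write the constants hidden in the $\Omega$ and $O$ notation as $\alpha, \beta > 0$ (names introduced here for illustration only):

```latex
\begin{align*}
  C_{2\varepsilon} \ge \alpha \sqrt{n}
  \quad\Longrightarrow\quad
  \eta_\varepsilon
  \le \beta\, (-\ln \varepsilon)\, 2^{-C_{2\varepsilon}}
  \le \beta\, (-\ln \varepsilon)\, 2^{-\alpha \sqrt{n}}
  = 2^{-\Omega(\sqrt{n})} .
\end{align*}
```

Read contrapositively, if Alice's detector efficiency exceeds this bound, then no LHV model with error $\varepsilon$ can exist, so the correlations are non-local.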